
The New Role of ‘Legal Data Translators’: Navigating the Intersection of Law and Data Science

By Amanda Catharine Chaboryk.

Artificial intelligence (AI) is advancing at a rapid pace, while the regulation and policy governing it are trying hard to keep up globally. Reflecting the advancement of Generative AI (Gen AI) in 2023, the Cambridge Dictionary named “hallucinate” its word of the year, signalling the technology’s growing importance and impact. One reason Gen AI is more accessible than other forms of AI is the availability of pre-trained models that can be fine-tuned for specific tasks, which has driven growing interest in its potential applications. Pre-trained models can be adapted to bespoke legal tasks, such as contract analysis, making Gen AI well suited to optimising efficiency in the legal industry. The legal sector, heavily reliant on language, is a particularly promising setting for the transformative potential of large language models (LLMs). The technology has shown the ability to disrupt the core foundation of the legal market, from the democratisation of legal advice to optimising due diligence in deals. For these opportunities to be seized, a multidisciplinary approach is required, comprising dynamic teams of developers, data scientists, and lawyers. In addition to these subject matter experts, there is a growing need for a ‘legal data translator’, serving as the ‘decoder’ who bridges the gap between legal and technical brilliance.

The role of a ‘data translator’ has received growing coverage over the years alongside the adoption of digitisation, particularly cloud computing. Covid-19 vastly increased demand for, and reliance on, online platforms to enable remote working and sustain operational resilience. Gartner, for example, forecast that worldwide end-user spending on public cloud services would grow 18.4% in 2021, to a total of $304.9 billion. Cloud computing serves as an enabler, providing integration across different domains and supporting the adoption of emerging technologies such as Generative AI. All these factors have necessitated the role of a ‘data translator’: a conduit (or ‘decoder’, if you like) between data scientists and Subject Matter Experts (SMEs). Data translators are skilled at understanding the commercial needs of an organisation, yet are data savvy enough to ‘decode’ technical requirements into comprehensible terms. They may not be ‘full-stack developers’, but they have a strong level of ‘data literacy’ and can distil the commercial meaning from the data (provided by the data scientists). Data translators bridge an important gap and serve as conduits for converting data to information, and information to knowledge.

Harvard Business Review (HBR) detailed a mere five years ago that companies need to look beyond data scientists to succeed with analytics and AI, noting the requirement for diverse and agile teams that comprise data engineers, data architects, and, most importantly, translators. MIT Sloan’s extensive research in 2016 yielded similar findings, citing the consistent disconnect between data scientists and the executive decision makers they support. This disconnect is bridged by data translators, who join the technical expertise of the data stakeholders with the operational expertise of an organisation’s diverse SMEs. Let’s apply this to the legal setting, namely a law firm containing finance, legal, knowledge management (KM), and business development (BD) experts. All these teams and SMEs produce large volumes of data that are vital for the key decision makers and the other groups. A challenge to consider, however, is that law firms produce a large volume of raw data that isn’t created with analysis in mind, unlike industries such as healthcare and insurance that have been utilising big data for decades. These data sources include enterprise resource planning (ERP) software for financial management, customer relationship management (CRM) platforms to manage BD and sales, and many others. Receiving an Excel file with thousands of rows of financial data isn’t inherently valuable unless it’s ‘translated’ into a story that provides the key findings. What is of value, however, is converting the raw data into insights: identifying the most profitable matters, write-off trends, and the practice areas experiencing the greatest growth. The CRM data, once structured and analysed by a ‘translator’, can help the business understand its clients and prospects better, and measure the ROI on BD and marketing activity.

A ‘legal data translator’ will help ensure that the data generated by the firm’s data sources translates into insights that are actionable by key stakeholders. As eloquently stated by Gu Jifa, systems scientist and professor, “data is the most basic level; information adds context; knowledge adds how to use it, and wisdom adds when and why to use it.” This precisely illustrates the importance of the emerging role of a ‘legal data translator’: bridging the gap between the legal and data domains.
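As a rough illustration of the ‘translation’ described above, the sketch below turns a handful of raw matter records into a ranked summary of write-off rates by practice area. It is a minimal example only: the field names (`practice_area`, `billed`, `written_off`), the sample figures, and the function name are all hypothetical and do not come from any real finance system.

```python
from collections import defaultdict

# Hypothetical rows standing in for a raw export from a firm's finance system.
MATTERS = [
    {"matter": "M-001", "practice_area": "Disputes", "billed": 120000.0, "written_off": 6000.0},
    {"matter": "M-002", "practice_area": "Disputes", "billed": 80000.0, "written_off": 14000.0},
    {"matter": "M-003", "practice_area": "Corporate", "billed": 300000.0, "written_off": 15000.0},
    {"matter": "M-004", "practice_area": "Employment", "billed": 50000.0, "written_off": 10000.0},
]


def summarise_by_practice_area(rows):
    """Aggregate billed and written-off amounts per practice area.

    Returns a list of summaries sorted by write-off rate, highest first,
    so the areas that most need attention appear at the top.
    """
    totals = defaultdict(lambda: {"billed": 0.0, "written_off": 0.0})
    for row in rows:
        area = totals[row["practice_area"]]
        area["billed"] += row["billed"]
        area["written_off"] += row["written_off"]

    summary = []
    for name, t in totals.items():
        # Guard against division by zero for areas with nothing billed.
        rate = t["written_off"] / t["billed"] if t["billed"] else 0.0
        summary.append(
            {"practice_area": name, "billed": t["billed"], "write_off_rate": rate}
        )
    return sorted(summary, key=lambda s: s["write_off_rate"], reverse=True)


if __name__ == "__main__":
    for line in summarise_by_practice_area(MATTERS):
        print(f"{line['practice_area']}: billed {line['billed']:,.0f}, "
              f"write-off rate {line['write_off_rate']:.0%}")
```

The ranked output is still only information; the translator’s contribution is explaining to stakeholders why a given area’s write-off rate is high and what to do about it.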

There is a wide variety of examples in the legal setting that demonstrate the importance of dynamic teams containing SMEs, data experts, and ‘translators’. This largely came to fruition last year, the year of ‘Gen AI’, with law firms getting to grips with the technology’s transformative potential in the legal sector. The journey naturally started with exploring and experimenting with Large Language Models (LLMs) and Generative Pre-Trained Transformers (GPTs). GPTs are a subset of LLMs that use a specific architecture (the transformer) and method (pre-training) to achieve high performance and versatility in natural language generation and understanding. Breaking this down, a ‘transformer’ is a model that processes natural language by using attention, which means it can focus on the most relevant parts of the input when producing each part of the output. ‘Pre-trained’ denotes that the model has been trained on a very large amount of text from an abundance of sources (books, websites, etc.). Finally, ‘generative’ means the model can use its pre-trained knowledge to generate new text that is original, coherent, and relevant to the input (or context).
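To make ‘attention’ a little more concrete, here is a simplified sketch of scaled dot-product attention for a single query, the basic weighting operation used in transformer models. It is illustrative only: real models apply learned projections, many attention heads, and matrix operations over whole sequences, and the toy vectors and function names below are assumptions made for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Each key is scored against the query; the scores become weights,
    and the output is the weighted average of the value vectors.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for three input tokens.
    query = [1.0, 0.0]
    keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights, output = attend(query, keys, values)
    print("weights:", [round(w, 3) for w in weights])
    print("output:", [round(x, 3) for x in output])
```

The token whose key is most similar to the query receives the largest weight, which is the sense in which the model ‘focuses’ on the most relevant parts of the input.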

The emerging technology required firms to assemble a clear strategy and guidance on how colleagues should engage with it, assess use cases, and determine what safeguards needed to be put in place. This involved (and still involves, as some firms are starting this journey now) ‘taskforces’ of technical and legal experts. To make decisions around risk and product selection, General Counsel (GCs) worked with IT to sign off usage and safeguarding policies. IT and software engineers provided technical expertise on the design and implementation of AI systems. Innovation teams engaged with different practice areas (such as disputes and corporate) to identify use cases and determine the quality and accuracy of outputs. BD teams managed engagement and marketing, focusing on articulating the firm’s strategy externally and to clients. Alongside this activity, lawyers worked hard to support clients on the legal considerations associated with purchasing, outsourcing, using, and even developing Generative AI tools. All these working groups and ‘taskforces’ involved ‘translators’ to create a common understanding (‘translation’) between the legal and technical data domains.

Law firms have been actively advising clients on the legal implications and risks around AI, covering key legal aspects such as Intellectual Property (IP), Data, Regulations and Compliance. This is particularly relevant in sectors such as financial services, healthcare, and consumer protection, where AI applications may be subject to specific rules, standards, and oversight. AI and data protection are naturally closely intertwined, as AI often relies on large amounts of personal data to function (and can also generate new or copied personal data that may not be transparent or accurate). Data protection lawyers need to understand AI to advise effectively, whether they are helping clients navigate regulatory frameworks or supporting them to identify and assess the data protection risks that AI poses, such as those relating to the purpose, legal basis, and scope of data processing. On the contentious front, data protection lawyers and litigators will need to represent and defend their clients in the event of data protection disputes, complaints, or investigations arising from the use of AI (whether brought by data protection authorities, data subjects or other stakeholders). Given the complex and relatively new nature of this type of advisory work, legal and technical expertise needs to be translated into digestible and actionable guidance. Legal data translators, however, aren’t just required in law firms; they are also required in policy development and governance.

Cross-collaborative teams with complementary skill sets are necessary to foster collaboration and continue to build on complex AI policy and regulations. Central to advisory work is staying abreast of rapidly developing AI regulation, such as the recent agreement on proposed legislation for AI regulation in the European Union (EU). The EU AI Act, at its core, will regulate AI systems based on their risk level, ranging from unacceptable to minimal/no-risk, with stricter rules for the higher-risk categories. On the policy development front, in late October and early November 2023, President Biden issued the Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, the leaders of the G7 reached consensus on International Guiding Principles on AI, and the Bletchley Declaration was published on the opening day of the AI Safety Summit. Dynamic teams, such as leading ‘AI Councils’ and international working groups, comprise a range of Subject Matter Experts (SMEs), including full-stack developers, lawyers, policy makers, scientists, civil servants, and engineers. What these teams also include is a ‘decoder’ or ‘translator’, converting commercial requirements into technical requirements (and vice versa).

A further tangible example of combining the skillsets of lawyers and developers is ‘legal prompt engineering’, a relatively new concept and skillset. “Prompt engineering” involves tactics for getting improved results from LLMs, such as writing clear instructions and breaking up complex tasks into simpler components (especially tasks that are subjective). “Legal prompt engineering” involves creating prompts, or specific language cues, that can be used in legal tasks, effectively applying the principles of prompt engineering to legal documents. As legal language is highly specialised and often contains complex concepts and terminology, careful prompting is vital to maximise the quality of outputs. Legal prompt engineering therefore requires legal expertise, natural language processing techniques, and data evaluation methods. Data evaluation methods are required to assess and improve a prompt’s quality and reliability, and to address any errors or gaps in the model’s output. This is yet another example of the unique skillset involved in bridging the gap between legal concepts and data methods, using the knowledge of both domains to solve problems.
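The two tactics mentioned above, clear instructions and breaking a task into simpler steps, can be sketched as a small prompt-building helper. This is a hypothetical illustration only: the function name, the wording of the instructions, and the sample clause are assumptions, and no particular LLM or vendor API is called.

```python
def build_legal_prompt(task, document_text, steps):
    """Assemble a prompt with an explicit task, numbered steps, and a delimited document.

    The numbered steps break a complex legal task into simpler components,
    and the delimiters separate the instructions from the text under review.
    """
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return (
        f"Task: {task}\n\n"
        "Work through these steps in order:\n"
        f"{numbered}\n\n"
        "If the document does not contain the answer, say so rather than guessing.\n\n"
        "Document (delimited by triple quotes):\n"
        f'"""\n{document_text}\n"""'
    )


if __name__ == "__main__":
    # Hypothetical clause and steps, for illustration only.
    prompt = build_legal_prompt(
        task="Review the limitation of liability clause in the contract below.",
        document_text="The Supplier's total liability shall not exceed the fees paid.",
        steps=[
            "Identify the clause that limits liability.",
            "State the cap on liability in plain English.",
            "List any exclusions from the cap.",
        ],
    )
    print(prompt)
```

Evaluating such a prompt would then mean running it against a set of documents with known answers and comparing the model’s outputs to them, which is where the data evaluation methods come in.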

Bridging the gap between data and the law has been a long-standing industry need, ensuring that legal innovation is driven by evidence (data) and ethics. Legal data translators fill a new and vital role in the legal sector, using data methods to solve legal problems and challenges. They are not only legal experts, but also skilled communicators who can demonstrate the value of analytics and AI aligned to business goals. So where does one find a ‘legal data translator’? Far and wide! They can be found in various settings, such as law firms, courts, academic institutions, and government agencies, where they help to bridge the language gap between data and law.


About the Author

Amanda Chaboryk is the Head of Legal Data and Systems within Operate, for PricewaterhouseCoopers in London. Her professional focus has been on legal project management, litigation finance, and technology. She now focuses on leading the operational delivery of complex managed legal programmes and helping clients navigate emerging technologies.

