Law Firms Enter the Golden Age of Data Mining

12 Dec - by aiuniverse - In Data Mining


As a matter of practice, law firms generate and store enormous amounts of data. Most, if not all, of that data has been digitized, and many firms that recognize its untapped value have begun to use sophisticated technologies to mine it for reusable work product and valuable insights. While most large law firms with 50 or more attorneys are currently mining data, not all of them are doing it well. Data mining is a relatively new practice in the legal space, and data profiles vary widely from one firm to another, so identifying the right tools and prioritizing initiatives can be challenging.

While it is likely that data mining could also be highly beneficial to smaller firms, it is much less common among them, primarily because their practices tend to be local or regional. In these firms, there is a tendency to rely on local colleagues and anecdotal information for strategic insight.

Many smaller firms probably should practice data mining; most do not. But it is increasingly clear that large firms must practice it in order to succeed in a highly competitive marketplace, and almost all of them are currently trying to figure out how to do it better.

Primary Use Cases for Data Mining

The most immediate benefit of data mining for law firms is increased productivity. This often takes the form of locating and reusing attorney work product. For example, when a new associate is given a writing assignment and lacks a frame of reference for the topic, she can minimize research time by searching internal work product for “model documents,” such as contracts, briefs or motions, that have proven effective. She may need to Shepardize previously used citations through the Shepard’s Citations Service to determine whether they’re still “good law,” but this spares her from having to reinvent the wheel.

Litigation attorneys can search their historical data for the names of attorneys, law firms, parties, judges and expert witnesses that others in their firm have faced previously, gaining insight and developing effective strategies for the matter at hand. This minimizes the need to send those anecdotal “what can you tell me about …” emails to the entire firm. Legal analytics platforms can do this much more effectively by searching all PACER data, but for firms that do not use legal analytics, mining your own firm’s data is your best bet.

In the context of M&A transactions, attorneys may use data mining technology to find recent work the firm has done on transactions similar to the current one, and then perhaps extend that insight by looking at public data in the SEC’s EDGAR database. Mining your firm’s own data may turn up covenants or clauses from recent deals (a “gold in the backyard” provision, for example) that are directly applicable to the current one. Successful data mining in this context often depends on conceptual rather than full-text searching to find specific deal points, so firms that do a lot of transactional work and deploy data mining technology for that purpose will need to ensure that their data is appropriately tagged for conceptual searching.


The primary tool needed for data mining is a good document management system (DMS). Every large law firm has one, whether developed internally or supplied by a service provider. However, a DMS is only as effective as the tags and metadata its users provide. If a document is not properly tagged and annotated with the correct metadata, including date, matter, client, practice and authors, it could be overlooked later, perhaps when it is needed most.
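To make the tagging point concrete, here is a minimal sketch of a metadata completeness check. It assumes documents are exported from the DMS as simple dictionaries; the field names mirror the metadata listed above, and the record layout and sample values are hypothetical rather than taken from any particular DMS.

```python
# Hypothetical field names, based on the metadata the article lists.
REQUIRED_FIELDS = ("date", "matter", "client", "practice", "authors")


def missing_metadata(doc):
    """Return the required metadata fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not doc.get(name)]


# Illustrative records only; a real export would come from the firm's DMS.
docs = [
    {"id": "D1", "date": "2019-03-01", "matter": "M-100",
     "client": "Acme", "practice": "M&A", "authors": ["jdoe"]},
    {"id": "D2", "date": "2019-04-12", "matter": "M-101",
     "client": "", "practice": "Litigation", "authors": []},
]

# Map each incomplete document to the fields that still need attention.
incomplete = {}
for doc in docs:
    gaps = missing_metadata(doc)
    if gaps:
        incomplete[doc["id"]] = gaps

print(incomplete)
```

A report like this could be run periodically so that gaps are fixed while the matter team still remembers the details.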

Of course, there are supplemental ways to fill in metadata gaps — for instance, if someone other than the author filed a doc and used his or her own name, searching the firm’s timekeeping system can find out who actually wrote it. Integrating applications such as DMS and timekeeping applications can make this process easier and faster while generating interesting and useful data.
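The timekeeping lookup described above might be sketched as follows. This assumes time entries are available as records with a timekeeper, a matter, an activity label and hours; all of these field names, the "drafting" label and the sample data are assumptions made for illustration.

```python
from collections import defaultdict


def likely_authors(time_entries, matter, activity="drafting"):
    """Rank timekeepers by hours logged on a matter for a given activity."""
    hours = defaultdict(float)
    for entry in time_entries:
        if entry["matter"] == matter and entry["activity"] == activity:
            hours[entry["timekeeper"]] += entry["hours"]
    # Most hours first; ties are broken alphabetically for stable output.
    return sorted(hours, key=lambda name: (-hours[name], name))


# Illustrative time entries; "bjones" filed the document but did not draft it.
entries = [
    {"timekeeper": "asmith", "matter": "M-100", "activity": "drafting", "hours": 6.5},
    {"timekeeper": "bjones", "matter": "M-100", "activity": "filing", "hours": 0.3},
    {"timekeeper": "asmith", "matter": "M-100", "activity": "drafting", "hours": 2.0},
    {"timekeeper": "clee", "matter": "M-100", "activity": "drafting", "hours": 1.0},
    {"timekeeper": "clee", "matter": "M-200", "activity": "drafting", "hours": 9.0},
]

candidates = likely_authors(entries, "M-100")
print(candidates)
```

The result is a ranked list of candidates, not a definitive answer; someone familiar with the matter would still need to confirm the author.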

In a similar vein, some large firms are now deploying enterprise search technology that can tap into every application operating within the firm, capture and structure that data, and provide granular insight in a single, integrated interface. While powerful, these systems can only search for information or metadata that has previously been entered into the system.

There are also search enhancement technologies that “crawl” through the DMS and automatically capture metadata. Some automatically apply conceptual labels to reduce reliance on full-text searching, while others capture and derive metadata, such as jurisdiction, governing law, and individual lawyers, firms, judges and expert witnesses, to make documents more “intelligent.” Advances in artificial intelligence are making automated metadata capture increasingly practical and affordable.
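As a rough illustration of derived metadata, the sketch below pulls a governing-law jurisdiction out of contract text with a regular expression. Commercial tools likely rely on far more sophisticated techniques; the pattern and function name here are assumptions that cover only one common phrasing of the clause.

```python
import re

# Matches phrasings such as "governed by the laws of the State of New York".
# This is a deliberately narrow, illustrative pattern.
GOVERNING_LAW = re.compile(
    r"governed by (?:and construed in accordance with )?"
    r"the laws? of (?:the State of )?"
    r"([A-Z][a-z]+(?: [A-Z][a-z]+)*)"
)


def derive_governing_law(text):
    """Return the jurisdiction named in a governing-law clause, or None."""
    match = GOVERNING_LAW.search(text)
    return match.group(1) if match else None


clause = "This Agreement shall be governed by the laws of the State of New York."
print(derive_governing_law(clause))
```

A value derived this way could be stored as a metadata tag so that later searches do not depend on the exact wording of the clause.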

Finally, some systems can also enable integrated searching of both internal and external data within a single interface. This is perhaps the holy grail of data mining for law firms.

Controlling Access

It is imperative that firms deploying data mining technology carefully think through security protocols. Most data in law firms can’t be accessible to everyone. A financial services client, for instance, may have an engagement letter stipulating that only individuals in the firm who are working directly on a matter can access the related documents. Access controls are typically configured through the firm’s DMS. Most firms will require both matter-level and document-level security. The practical result is that search results will vary considerably across individual users according to their access rights. In some cases, firms seeking to make “model documents” widely available may be able to provide broader access to matter-related documents by redacting certain information from them. Whatever the circumstances, firms need to ensure that the technology they are using provides granular controls over access.
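The way matter-level and document-level security shape search results can be sketched as below. The data model is hypothetical: it assumes each matter has a team of permitted users and that a document may carry an optional, narrower list of its own. Real DMS permission models are typically richer than this.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class Document:
    doc_id: str
    matter: str
    text: str
    # Optional document-level restriction; None means "inherit from the matter".
    allowed_users: Optional[FrozenSet[str]] = None


def can_view(user, doc, matter_teams):
    """Apply the matter-level check first, then any document-level restriction."""
    if user not in matter_teams.get(doc.matter, set()):
        return False
    return doc.allowed_users is None or user in doc.allowed_users


def search(user, query, docs, matter_teams):
    """Return IDs of documents that match the query and that the user may see."""
    needle = query.lower()
    return [d.doc_id for d in docs
            if needle in d.text.lower() and can_view(user, d, matter_teams)]


# Illustrative data: two matters, one document restricted to a single user.
matter_teams = {"M-100": {"alice", "bob"}, "M-200": {"bob"}}
docs = [
    Document("D1", "M-100", "Indemnification clause draft"),
    Document("D2", "M-100", "Indemnification side letter", frozenset({"alice"})),
    Document("D3", "M-200", "Indemnification memo"),
]

print(search("alice", "indemnification", docs, matter_teams))
print(search("bob", "indemnification", docs, matter_teams))
```

The same query returns different results for each user, which is the behavior described above. Filtering inside the search function, rather than in the interface, keeps restricted documents from leaking into result counts or snippets.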

Data Security and the Cloud

While security concerns about storing data in the cloud have eased considerably in other industries, many law firms are still reluctant to deploy cloud technology. This is partly a cultural phenomenon in which law firms lag behind other industries in technology adoption, but it can also be client-driven: Some clients may strictly prohibit firms from storing data off-premises. That said, we are beginning to see less resistance to the technology, which can be a difference-maker in the context of data mining.

The cloud offers superior accessibility to data and huge advantages in flexibility and scalability. As analytics technology improves, the cloud also enables the kind of data integration perhaps best exemplified in knowledge graphs, which help users quickly and visually “connect the dots” from disparate data sources and access insights that were previously unattainable. These are powerful capabilities that some attorneys are already using to develop data-based legal strategies and win cases in litigation, and to make better decisions in the business of law.

It is no longer controversial to claim data is safer in the cloud than behind a firewall. Major cloud technology providers like Amazon Web Services offer more comprehensive and robust security than firms can provide in local environments. It’s simply too complex and too expensive for firms to provide the same level of security on their own. On the other hand, clients still drive firm behavior. A kind of compromise solution that is increasingly common in the legal world involves delivering applications via the cloud which access data either from on-premises servers or perhaps a private cloud.

Data Retention and Its Implications for Data Mining

While there are important liability and risk considerations that firms need to take into account when creating data retention policies — considerations that are beyond the scope of this article — firms need to understand that having too much data on hand can create serious performance issues and undermine the utility of data mining. Regularly cleaning out the “data closet” of duplicate files and “junk,” like fax cover sheets or non-matter-related emails or files, ultimately makes repositories more responsive and useful. Firms that are serious about data mining should take the time to develop protocols for removing old and irrelevant data that at the same time comply with regulations and adhere to internal policies.
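One part of that cleanup, finding exact duplicate files, can be sketched with content hashing. The example below assumes file contents are already loaded in memory and uses made-up file names. It only flags candidates: anything it finds would still need to be checked against the firm's retention policies before removal.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files):
    """Group file names whose contents are byte-for-byte identical."""
    by_hash = defaultdict(list)
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        by_hash[digest].append(name)
    # Keep only groups with more than one file, sorted for stable output.
    return [sorted(names) for names in by_hash.values() if len(names) > 1]


# Illustrative repository contents keyed by file name.
files = {
    "brief_v1.docx": b"Motion to dismiss",
    "copy of brief_v1.docx": b"Motion to dismiss",
    "memo.docx": b"Research memo",
}

duplicates = find_duplicates(files)
print(duplicates)
```

Hashing catches only exact copies. Near-duplicates, such as the same brief with a changed date, would need a different technique.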


Data mining promises to be transformative in firms, but if the system is slow or unreliable, attorneys won’t use it and the investment will be largely wasted. That would be a shame, because the benefits of the technology can extend to virtually every workflow in the law firm. Wherever firms use applications that accumulate data, mining technologies have the potential to increase productivity, deliver new insights and help them stand out in an increasingly competitive marketplace.
