Corporate Counsel Connect collection

December 2015 Edition

Important considerations for mid-range e-discovery data collection

Kyle Sparks

Executive Summary

Data collection during e-discovery is critically important because a significant number of court sanctions are the result of inadequate or improper data collection. Here are two examples:

  • In the case of Peerless Industries, Inc. v. Crimson AV, LLC, the plaintiff requested various documents that were held by Sycamore Manufacturing Co., Ltd., a sister company of Crimson. However, Crimson told the court that they had delegated the data collection process in this case to their vendor and they assumed that their vendor had instructed the Sycamore staff on how to collect documents, but neglected to verify that this was actually the case. Since Crimson was not able to collect the requested information and could not answer basic questions about their IT infrastructure and other key issues, the court sanctioned Crimson and ordered the company to “show that they in fact searched for the requested documents and, if those documents no longer exist or cannot be located, they must specifically verify what it is they cannot produce.”
  • In Procaps S.A. v. Patheon Inc., two opinions were handed down by the court, largely in response to poor data collection practices and protocols in the case and inadequate support from the plaintiff’s counsel. The first opinion focused on the plaintiff’s decision not to implement a litigation hold, inadequate communication between counsel and data custodians in Colombia (where the plaintiff was based), and a failure to assist data custodians who were charged with the collection of Electronically Stored Information (ESI), the results of which were data searches that the court deemed “inadequate.” Sanctions in the case included payment of attorneys’ fees, as well as the costs of hiring an independent third party to conduct a forensic examination of Procaps’ data stores.

Clearly, improper data collection can result in potentially significant sanctions.

Important issues in ESI data collection

Courts tend to approve of “casting a wide net”

There are two basic approaches to collecting data during an e-discovery exercise:

  • Collect everything that is available, including emails, files, text messages, social media posts, and anything else that in some remote way might be relevant; and then use various tools and human review to cull the collected data.
  • Use a more restrictive approach to gathering data, collecting only what will reasonably be deemed necessary to satisfy the requirements of the e-discovery order.

Courts tend to approve of the “casting a wide net” approach to data collection because it provides the assurance that a party is collecting everything that might potentially be relevant. Attorneys also often favor this approach because gathering everything possible typically takes less time and effort than a more selective approach.

The reality is that it’s better to do the opposite

The best practice for data collection is still to collect a large amount of data, but also to cull it so that only relevant data remains during e-discovery. For example, instead of making forensic copies of the contents of a large number of hard drives, it is more advantageous to produce only the relevant content from these hard drives through appropriate culling processes. While there are some limited situations in which courts seek production of very large quantities of data, this is not the norm.
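As a rough illustration of what a culling pass might look like, the sketch below filters collected items by custodian and date range and drops exact duplicates. The `CollectedItem` fields, the `cull` function, and the choice of criteria are all assumptions made for this example; actual culling criteria depend on the matter and on what the parties and the court agree to.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CollectedItem:
    """Hypothetical record describing one collected item."""
    custodian: str
    sent_on: date
    content_hash: str  # e.g. a SHA-256 of the item, used for de-duplication
    subject: str


def cull(items, custodians, start, end):
    """Keep items from the named custodians dated within [start, end],
    dropping exact duplicates (same content hash)."""
    seen = set()
    kept = []
    for item in items:
        if item.custodian not in custodians:
            continue  # custodian is out of scope
        if not (start <= item.sent_on <= end):
            continue  # outside the relevant date range
        if item.content_hash in seen:
            continue  # exact duplicate of an item already kept
        seen.add(item.content_hash)
        kept.append(item)
    return kept
```

Culling of this kind is applied to a working copy of the collection; the originally collected data stays untouched so that the filtering can be explained and repeated if it is challenged.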

The primary advantage of collecting information in a more focused manner is that it saves substantially on attorney and paralegal costs and processing fees since there is less information to examine during document processing. Content hosting fees are also lower because less data is stored during the litigation process. Moreover, given the tighter timelines that the new FRCP amendments going into effect in December 2015 will impose on e-discovery, minimizing the amount of data collected may make it easier to work within those timeframes.

The majority of e-discovery cases do not generate enormous amounts of data; however, it is essential to keep in mind that:

  • Data must still be collected properly regardless of the volume involved. It is essential that the method of data collection chosen be forensically sound – i.e., that the ESI collected is not modified in any way and that a proper chain of custody can be established for the collected data.
  • The focus in any data collection must be on preventing spoliation of data, whether by missing relevant data sources during the collection phase or by altering the data that is collected.
  • The ultimate goal of any data collection effort is to minimize the risk that can be created by the collection process itself.
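One common way to show that collected ESI has not been modified is to record a cryptographic hash of each file at the moment of collection and re-check it later. The following is a minimal sketch of that idea, not a substitute for a forensic collection tool; the function names and the fields of the custody record are hypothetical and chosen only for illustration.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_entry(path: Path, collector: str, source: str) -> dict:
    """Build one hypothetical chain-of-custody record for a collected file."""
    return {
        "file": str(path),
        "sha256": sha256_of(path),
        "size_bytes": path.stat().st_size,
        "collected_by": collector,
        "source": source,
        "collected_at_utc": datetime.now(timezone.utc).isoformat(),
    }


def is_unmodified(path: Path, entry: dict) -> bool:
    """True if the file still matches the hash recorded at collection time."""
    return sha256_of(path) == entry["sha256"]
```

Each time the data changes hands, a new entry with the same hash can be logged, so that any later mismatch pinpoints where in the chain an alteration occurred.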

There is wide variability in organizations’ technology proficiency

Most smaller organizations do not have the technology proficiency or specialized skill sets required to adequately address the various data collection issues involved in e-discovery. This not only tends to drive up the costs of data collection, but it also increases the risk of over- or under-collecting data, spoliation of data, or data being rendered inadmissible.

Best practices for data collection

There are several best practices that organizations should consider when addressing mid-range data collections.

Assemble the right personnel for data collection

First and foremost, a team with the right knowledge and skill set is key to reducing risk in data collection. The point person should have a strong background in IT because some of the content that may need to be collected will be from sources that require more specialized collection skills, such as proprietary CRM systems, Microsoft SharePoint, or databases.

The importance of having an IT staff member who is skilled in finding and collecting ESI as the data collection lead should not be underestimated. For example, in the case of Green v. Blitz USA, Inc., the manager who was put in charge of the defendant’s data collection efforts described himself as “about as computer ... illiterate as they get.” While there are risks inherent in self-collection, these risks can largely be mitigated if the leader of the collection effort is technically competent.

Ideally, if the resources and personnel are available, a team consisting of IT, legal, and business staff members should be assembled to manage the data collection process. Together, these skill sets permit a more thorough understanding of what is being collected and of its relevance, which further reduces risk in the collection process. While many in the legal profession are opposed to organizations’ self-collection of data during e-discovery, having a team of competent professionals with the right technical skills can mitigate much of the risk during data collection.

Create a data map

The next step should be to create a data map that will help to inventory corporate data and identify the location and type of all data that may be subject to collection. The benefit of a data map is that it can guide data collectors and speed the data collection process. Moreover, it can also satisfy a court’s requirement that an organization make a good faith assessment of where all potentially relevant data is located.

In an ideal world, creating a data map would be a relatively simple exercise, but it won’t be in many organizations. Potentially relevant data can be found on corporate desktops, laptops, mobile phones, and tablets; corporate email systems; SharePoint and other collaboration systems; employee-owned laptops, mobile phones, and tablets; employee-managed file sync and share solutions like Dropbox; corporate file shares; USB drives; and corporate- and employee-managed cloud storage and backup systems. Data types can include email, files, text messages, social media posts, photographs, and a wide range of other data types.
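A data map does not need specialized software to get started; even a simple structured inventory can guide collectors. The sketch below shows one possible shape for such an inventory. The `DataSource` fields, the helper functions, and the sample entries are hypothetical and would need to be adapted to the organization’s actual systems.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    """One hypothetical entry in a data map."""
    name: str
    location: str
    data_types: list
    managed_by: str  # "corporate" or "employee"
    custodians: list = field(default_factory=list)


def sources_for(data_map, data_type):
    """Return every source that may hold the given type of data."""
    return [s for s in data_map if data_type in s.data_types]


def employee_managed(data_map):
    """Return sources outside corporate management, which are easy to miss."""
    return [s for s in data_map if s.managed_by == "employee"]


# Illustrative entries only; names and locations are made up.
DATA_MAP = [
    DataSource("Corporate email", "on-premises mail server", ["email"], "corporate"),
    DataSource("File shares", "corporate file server", ["files"], "corporate"),
    DataSource("Personal tablet", "employee device", ["files", "photographs"],
               "employee", custodians=["J. Doe"]),
    DataSource("Personal file sync", "third-party cloud", ["files"], "employee",
               custodians=["J. Doe"]),
]
```

For example, `sources_for(DATA_MAP, "files")` lists every place files might live, and `employee_managed(DATA_MAP)` flags the sources that require cooperation from individual employees.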

There are two challenges inherent in creating a data map. First, when data is distributed across an organization and among many different platforms – only some of which are under IT’s control – data collected for e-discovery is a moving target and can be difficult to find. Second, some data may be difficult to locate at all. For example, a corporate business record created by an employee on his or her personally owned tablet and saved to a personal file sync and share tool may be “invisible” to those charged with collecting data for e-discovery. However, it is essential to collect data from all relevant sources, even those that are under the control of individual employees. For example, in the case of Small v. University Medical Center of Southern Nevada, the special master assigned to the case recommended a default judgment in favor of more than 600 plaintiffs because data from personally owned mobile devices, among other data sources, was not retained properly by the defendant.

Ensure that metadata is preserved

It is essential to collect data properly so that metadata is preserved throughout the data collection process. Metadata – which is data within files that provides information about these files, such as the author and last accessed date – is an essential element that must be retained intact and unmodified during the data collection process in order for information to be defensible. For example, simple drag-and-drop of data during the collection process can alter the metadata of the copied files, potentially rendering the data inadmissible for e-discovery. So important is metadata in the context of discoverable information that the Supreme Courts of Arizona and Washington State have determined that metadata must be retained as part of the information that an organization archives.
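To illustrate why the copy method matters, the sketch below contrasts two functions from Python’s standard `shutil` module: `shutil.copy`, which copies file contents and permissions but gives the copy a fresh modification time, and `shutil.copy2`, which also attempts to carry over timestamps. The helper names `snapshot_times` and `copy_preserving_times` are hypothetical. Note that even `copy2` cannot preserve every kind of metadata on every platform (ownership and access control lists, for instance), so this demonstrates the problem rather than offering a forensically sound collection method.

```python
import shutil
from pathlib import Path


def snapshot_times(path: Path) -> dict:
    """Record a file's size and timestamps, ideally before it is touched."""
    st = path.stat()
    return {
        "size_bytes": st.st_size,
        "modified_ns": st.st_mtime_ns,
        "accessed_ns": st.st_atime_ns,
    }


def copy_preserving_times(src: Path, dest_dir: Path) -> Path:
    """Copy src into dest_dir with shutil.copy2, which also attempts to
    copy the modification and access times along with the contents."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir / src.name))


def copy_plain(src: Path, dest_dir: Path) -> Path:
    """Copy src with shutil.copy; the copy gets a new modification time."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(src, dest_dir / src.name))
```

Taking the snapshot before copying gives the collector a record of the original values to compare against the copy and to include in the collection log.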

Focus on the “low-hanging fruit” first

Another best practice is to concentrate first on the “low-hanging fruit” – the repositories that contain the largest volumes of data that will be relevant during e-discovery. In most organizations, this will include corporate email systems (typically Microsoft Exchange on the back end and Outlook on the desktop or laptop) and employees’ personal directories on their hard drives. Email systems are typically the largest single repository of corporate business records, largely because the typical information worker spends at least 150 minutes per day doing work in their email system. One best practice is to extract the necessary content into .pst files or equivalents for loading into review platforms, although other repositories must also be processed.

Summary

Data collection is an essential element of the e-discovery process because of the important ramifications it can have on the admissibility of evidence and the mitigation of risk during litigation. Organizations involved in mid-range data collection efforts should take special care to follow appropriate best practices so that collected data is defensibly gathered, the costs of data collection are kept as low as possible, and risk is minimized.


About the Author

Kyle Sparks is a CEDS Certified Speaker. Kyle’s 22-year career in the legal discovery profession has traversed firm and vendor leadership roles. From paper discovery in big tobacco litigation to building a litigation support department focused on e-discovery for an Am Law 200 firm, Kyle has obtained a comprehensive understanding of the discipline. Serving as an IT and lit support manager has provided a wide scope of industry software and legal knowledge. Today, as a Senior e-discovery Specialist and subject matter expert for Thomson Reuters, Kyle specializes in educating clients on all phases of the EDRM model as well as rules of civil procedure.

