Category Archives: Theory

The new Sedona publication

I will have a detailed and reasoned critique of The Sedona Conference® Commentary on Achieving Quality in the E-Discovery Process out as soon as possible.

I think this is pretty important stuff, so I want to take the time to do it properly.

Until then, read the work done by Tobias Mayer.

It will give you a VERY good idea of why the Sedona piece doesn’t work for me and why it’s pointing decision-makers in the wrong direction for all the right reasons.

Thanks to Ralph Losey for flagging it! I might not have read it for a while had it not been featured on his blog with all those pictures of people I do NOT want to emulate!

Cataphora…Yummy! Part I

I’m in love with H5 and not just because of their killer website and name that has more cowbell than I can handle, but I also am having special moments with Cataphora.

In the run up to the Sedona Conference, I’m taking another look at legal tech from the perspective of what lawyers really need and I’m pretty sure that the teeny weeny email application I’m writing (prototype due out at the end of the summer) is on the right track.

That being said, Cataphora really does have the right approach and are FIXING IT! “It” being the way search and retrieval of meaningful content is and ought to be done.

So, understand their mission, if you choose to accept it (and you should):

Defining principle

Cataphora’s success and very existence are based upon one defining principle. This arose from a conceptual breakthrough that was simply stated, yet proved to be radical in its practical effects. This idea was that, in order to truly understand a document, you have to know about the circumstances in which it arose. In other words, you have to understand its context.

The negative

Ok, they’re a bit … oh, I don’t know, whiny and snarky:

Trustworthiness is a core value in the legal marketplace and at Cataphora. We strongly recommend examining all vendor claims carefully. One way to do that is by looking at how their website used to look. The Wayback Machine makes that easy – just go here and enter the URL for the vendor in which you are interested.

I mean, really! Who CARES what a website looked like in the Wayback Machine?! Is this really part of the E-DISCOVERY dog/pony show to which unsuspecting clients are subjected? Didn’t think so.

If that’s the standard, then most legal tech projects would be doomed, because when some of us were using computers in litigation, some others of us were still in high school counting spots. And, of course, some others of us were practicing law and making googobs of money (ahem!). Besides, an appearance on the Brewster Kahle show is not really an indicator of algorithmic quality. It just means you’re lucky, really smart or have your own private Tardis.

OK, so I don’t think Cataphora folks are lucky. I think they’re really, really smart. Buuuut, it’s not exactly rocket science. Maybe it was 10 years ago, but not any more. Ever looked at Digg Labs?

Two Patents? Hmmm. Gotta think about that one.

I’ve read both, but now I’d better look at the pictures, because this is one area where I’m pretty sure there’s so much out there now that these patents may not hold much water. Not a huge fan of business method patents anyway, but when they involve stuff that seems to be open-sourced to the hilt, it gives me pause.

I could be so incredibly wrong, so I’ll take another look, but at first blush, what they write about on the site seems right out of Collective Intelligence, a book I keep next to the bedside, cuz I’m that much of a dork.

Back to the Positive

But, Elizabeth Charnock, the founder of Cataphora is a much bigger dork, said with all sorts of love, so I think worship might still be in order.

Plus, she’s a girl, and that makes her AWESOME! And, a little scary.

I love Cataphora because….

they “get” the wisdom of letting computers do what they do best. And, computers don’t really care whether you have 1 or 2 terabytes. Which means that you can leave your data unculled, and the computer will keep chugging along.

Not only that, but once the data has been marked as “non-responsive,” it can still be used for all sorts of things. Like weighting. And, making your useful dataset searching smarter. Wanna know how? I bet you do!

Call me! No, really: call me.

So, let’s get real: [ bold statement ] there’s absolutely NO real reason to cull ESI ! [ /bold statement ]

I found this really HYSTERICAL presentation the other day entitled The Real Cost of Privilege Review ( and here) and all it did was make me think of lollipops. Read it and weep.

I want to know who out there is wasting soooo much money…so I can sign them up as clients, because this sort of process is so yesterday, even your 10-yr old could probably re-engineer it to get better and more cost-effective results. I mean, hasn’t anyone ever played pick up sticks???

So, why am I adamant that people stop culling? Because it’s like trying to speak French without any understanding of grammar.

To be more precise, I’m advocating that lawyers stop culling at the first tier, as if their lives depended upon a massive reduction in terabytes. I’m suggesting that culling ought to be done by the computer, and that valuable metadata (in a really, really broad sense) ought to be retained until the end of the project. I’m intimating that the document corpus is a body and its integrity depends upon the entirety of its members.

In other words, in Cataphora-lingo:

Cataphora is the first and only provider to develop deep analytics (not mere data statistics or simple email widgets) that give insight into the facts expressed by the ESI dataset. True analytics can (among many other things) detect individual and organizational “heartbeats” and de facto organizational substructures, evaluate typical versus anomalous behavior, assess consistency and variation in an organization’s processes, and detect patterns of data deletion.

If you’ve got lawyers doing the culling and searching, here’s yer sign: you’re going about it the wrong way. It’s like taking a hatchet to an old growth forest of oak with no appreciation for the vital role played by acorns.

I say, hire yourself a legally-trained person who knows about taxonomies and understands the difference between DATA, INFORMATION and ARGUMENT.

Lawyers do not. Not unless they’ve taken “Data 101.” They usually work bass ackwards and try to squeeze everything into theory, instead of first trying to understand what they’ve got.

This is where Cataphora’s mission is key: understand the forest before you start cutting trees.

Thus, sayeth the Mighty Snarker, thereby ending-eth the lesson!

TRECing along

Found an incredible site today, the H5 site.

It’s a lot easier doing this when you have 30 industrial-strength brains to my 1 eco-friendly noggin, but OK, I’ve never been put off by a [ huge freakin’ ] challenge.

[sigh]

Anyway, through their site, I found the site I’d been looking for, which talks about information retrieval from large data sets. Thank goodness I was particularly curious today, or I’d have NEVER found it! …Well, I would have, but given that I’ve been looking nearly a year, that says something about either it or me.

What was I looking for? Well, it’s the TREC site. Looking for it too? Well, here ya go!

The Text Retrieval Conference (TREC) workshop series encourages research in information retrieval and related applications by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. Now in its eighteenth year, the conference has become the major experimental effort in the field. Participants in the previous TREC conferences have examined a wide variety of retrieval techniques and retrieval environments, including cross-language retrieval, retrieval of web documents, multimedia retrieval, and question answering. Details about TREC can be found at the TREC web site, http://trec.nist.gov.

You are invited to participate in TREC 2009. TREC 2009 will consist of a set of tasks known as “tracks”. Each track focuses on a particular subproblem or variant of the retrieval task as described below. Organizations may choose to participate in any or all of the tracks. Training and test materials are available from NIST for some tracks; other tracks will use special collections that are available from other organizations for a fee.

Might be a way to get invited to Sedona…can I get a proposal together for the Legal Track in like 4 days? Hmmmmmm…

Got 42 billion legal documents?

I didn’t think so.

So, I totally agree that we can do away with the Bates system for identifying unique documents in litigation and move towards hashing them instead.

Here’s why:

Question: you’re ITM and you’ve been sued 1,000 times. Probably not an exaggeration.

How often did you get a discovery request for your organizational chart? Probably about 500 times.

Which means that in upwards of 500 cases, 5 people on YOUR side [ 2 lawyers, 1 paralegal and 2 document reviewers ] touched that document, meaning 2,500 external touches for one document. Even at $ 1.00 per touch (HA!) that’s $2,500.00 per document over its litigation life.

Add in the internal touches, court touches and appellate level touches and I sure hope it was full of pretty colors, because if you’re ITM and you LOST that lawsuit, you’ve just paid double.

This is a technical problem, not a legal one, although the impact upon the legal/litigation community could be severe. What this means is that software developers MUST figure out a better way. Because we can.

This is a great playground to be in, –especially given the economy…

So, in looking at TinyUrl, I wondered why they only used 6 slots for their hash.

For purposes of the guess, a “close enough” estimate of how the algorithm works would be to look at the possible items that go in each slot [a-z] and [0-9] and see how many variations were available, mathematically speaking.

How a hash is generated isn’t important, because there are several well-defined ways of doing so. The only thing that really matters is that the generator interface check that a hash has not already been used and generate another one in the teeny number of exceptions when a double is created.
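For the curious, here is a minimal sketch of that "check and try again" idea, assuming a random generator over [a-z] and [0-9]. The function name `new_code` and the in-memory `used` set are made up for illustration; I have no idea how TinyUrl actually does it, and a real service would keep the used codes in a database.

```python
import secrets
import string

# The 36 characters available for each slot: [a-z] and [0-9].
ALPHABET = string.ascii_lowercase + string.digits


def new_code(used: set[str], slots: int = 6) -> str:
    """Return a code that is not yet in `used`, and record it there.

    Hypothetical helper: draws random codes until it finds an unused one.
    """
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(slots))
        if code not in used:  # the rare double simply triggers another draw
            used.add(code)
            return code


used_codes: set[str] = set()
first = new_code(used_codes)
second = new_code(used_codes)
print(first, second)
```

The loop only repeats when a double turns up, which stays rare as long as the bucket is nowhere near full.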

The math of it is fairly simple, so I’ll spare you the link clicking to Chemical-Ecology and simply give it to you:

a to z is 26 letters plus 10 digits for a total of 36 items in each slot. There are 6 slots, so the formula for this is

36P6 = 36! / (36-6)! = 1,402,410,240

Then, I wondered how the total would change if I got greedy and added oooone more slot.

The formula becomes:

36P7 = 36! / (36-7)! = 42,072,307,200

That is a LOT of documents. Which means that IBM might be able to fill the bucket, but very few other companies will. In fact, the Forbes top 200 might want their own buckets, which would STILL be better than the current Bates system.

[ mathematical corrections are always welcome! ]
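In that spirit, here is a quick way to check the figures. The permutation count (nPk) assumes no character is ever reused within a code. If repeats are allowed, which is my assumption about how most short codes behave and not something I have verified for TinyUrl, the count is simply 36 raised to the number of slots, which is bigger still. The function names are mine.

```python
import math

ALPHABET_SIZE = 26 + 10  # a-z plus 0-9


def codes_without_repeats(slots: int) -> int:
    """Ordered arrangements with no character reused: n! / (n - k)!"""
    return math.perm(ALPHABET_SIZE, slots)


def codes_with_repeats(slots: int) -> int:
    """Ordered arrangements where characters may repeat: n ** k"""
    return ALPHABET_SIZE ** slots


for slots in (6, 7):
    print(
        f"{slots} slots: "
        f"{codes_without_repeats(slots):,} without repeats, "
        f"{codes_with_repeats(slots):,} with repeats"
    )
```

Either way, the 42 billion figure is the conservative one: allowing repeats only makes the bucket larger.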

Now, according to my Outlook text files, they send the potential variations through the roof with a text version format of [ 76193731-000000DGC.eml ]

Soooo, if we then ask how likely it is that the courts will be deluged with documents on the order of 42 BILLION, then I think it’s safe to say that it makes sense to pile all documents into one bucket and assign them unique litigation numbers rather than have each party Bates stamp their own.

The potential ROI for companies that are repeatedly sued (Forbes top 1000?) is impressive. Can you imagine how much money clients would save, if 90% of discovery documents were found to have already been assigned a number??

It’s a simple matter to ensure that the numbers are truly unique, so the next step would be to tag them appropriately. In fact, companies could keep repositories forever and provide an API to specific discovery requests, rather than actually delivering discovery.
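Here is a rough sketch of what "already been assigned a number" could look like, assuming documents are recognized by a SHA-256 digest of their bytes. The `DocumentRegistry` class, the `DOC-` numbering format and the choice of SHA-256 are all my own illustration, not an existing system.

```python
import hashlib


class DocumentRegistry:
    """Hypothetical shared bucket: one litigation number per unique document."""

    def __init__(self) -> None:
        self._numbers: dict[str, str] = {}  # content digest -> litigation number

    def register(self, content: bytes) -> tuple[str, bool]:
        """Return (number, is_new); identical bytes always get the same number."""
        digest = hashlib.sha256(content).hexdigest()
        if digest in self._numbers:
            return self._numbers[digest], False  # seen in an earlier case
        # Eleven digits is enough room for the 42 billion bucket.
        number = f"DOC-{len(self._numbers) + 1:011d}"
        self._numbers[digest] = number
        return number, True


registry = DocumentRegistry()
org_chart = b"ACME organizational chart, 2008"
print(registry.register(org_chart))  # first lawsuit: a new number is assigned
print(registry.register(org_chart))  # later lawsuit: same number, nothing new
```

One caveat: this only matches byte-for-byte identical files. A re-scanned or re-saved copy of the same org chart would hash differently, so a real system would need some normalization first.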

Something to think about in these days of cloud computing!

Oxo as a Model for Law 3.0

I was thinking today about the prospect for a program that refits the homes of the elderly (while they’re off visiting the grands or something) so that they can be self-sufficient longer.

A very good reason to move to Florida.
But, then I started thinking about Oxo products (which I use and love). Since I think they had a great, simple idea, I decided to look at their mission statement, as a way to take a working-break from devising the mission statements for my own projects. This is what I learned:

OXO is based on the concept of Universal Design. But what is Universal Design and how does it benefit users? In simplest terms, Universal Design means the design of products usable by as many people as possible. In the case of OXO, it means designing products for young and old, male and female, left- and right- handed and many with special needs.

“Universal Design”? Cool. This phrase deftly encapsulates my dreams of master-minding a revolution from the perspective of legal technological innovation. Wish I had thought of it myself. Here’s one articulation of the Principles of universal design:

PRINCIPLE ONE: Equitable Use

PRINCIPLE TWO: Flexibility in Use

PRINCIPLE THREE: Simple and Intuitive Use

PRINCIPLE FOUR: Perceptible Information

PRINCIPLE FIVE: Tolerance for Error

PRINCIPLE SIX: Low Physical Effort

PRINCIPLE SEVEN: Size and Space for Approach and Use

BWAAAHHHAHHHAAA~!

Here’s what one path looks like:

Step One: learn something about the law. Check!
Step Two: impatience with the inanity of legal practice becomes intolerable. Check!
Step Three: understand that my highest use is NOT chasing clients. Check!
Step Four: Battle with middle-class demons for 15 years. Check!
Step Five: Say, what the hell! I gotta be meeeeee! Check.

So, when Ron Gruner asks, as he did in his White paper, “What’s Holding You back?” I can sum up the answer in 4 words: A Team To Execute

That’s all and that’s it.

Now, in solving that problem there is a major dependency, called “money.” But, money is not the first consideration when it comes to “getting stuff done.” The Amish don’t need money to build barns, they need people.
The dirty little secret of capitalism is that there are a LOT of people who are not money-motivated. Most of them work long hours for other people. They’re not motivated by money,–they’re motivated by the things money can buy, their own limitations and their fears.

People who are not money motivated are usually people with experience having lots of it. They know money doesn’t buy happiness and having “enough” money is relative to one’s perceived needs. That’s one of the lessons that allows our veterans to survive on the street,–the military taught them that survival isn’t about money, it’s about resourcefulness.

One thing I know is that there is a deep pool of people who have passion and who would gladly trade a Lamborghini and 100-hour work weeks for a satisfying 60-hour work week (let’s be realistic). Those are the people I need to find.

I find them, then I find someone with money to pay them to execute the visions.

That’s it and that’s all.

The imPERTinence of Law 2.0

Law 2.0, and its bratty little sister, 3.0, is a form of apostasy, the beginning and maturation of a fundamental shift in the way legal services are conceived, produced and consumed.

Law is a type of religion,–much like science. There is a technique to legal analysis that one learns in a “law seminary.” Catechismic competency is perfected state by state, with national definitional certification possible through tests such as the multi-state and professional ethics exams. The bar number is proof that one is ordained to wax on, wax off in arenas from the court room to Court TV.

There’s yet to be a law school that develops its curriculum with the idea of training lawyers to think differently about the law. Indeed, the only people who are allowed to think differently are professors, who are granted tenure based on the combination of connection and disconnection from orthodoxy. And, of course, their ability to pull alumni dollars. Such is the nature of accreditation.

However, the argument can be made that once one has spent 20 years outside of academia, it’s high time to start thinking about the way law could be improved. That’s why I’m obsessed with Law 2.0 / 3.0 technologies. There aren’t many of them, but we’re starting to see some really great stuff, a lot of which is chronicled on Wired GC. I’ll try to open up a dialogue with the Vallex Fund, as well, since it states that it has an objective of investing in legal technological innovation:

The Vallex Fund will encourage entrepreneurs to consider the legal industry as a major, new opportunity. “Big Law” alone is a $100 billion industry whose clients are increasingly dissatisfied with the ever increasing costs of legal services. This is not a new trend. Over the last forty years many studies and surveys have been conducted chronicling the problems, inefficiencies and even abuses within the nation’s civil law system. Yet many, both inside and outside the legal profession, believe improvements have been far too slow in coming. A new approach is needed. As experienced entrepreneurs, the Vallex Fund’s management and investors believe an entrepreneurial approach can help expedite change and fresh thinking as it has so successfully done in other industries.

If law is a car, Law 2.0 deals with the connection between dealers and manufacturers (efficiencies), dealers and consumers (reputation) and the used car economy (mashups). It’s time for a bailout.

And, to get that bailout, Law 3.0 is here to find out what goes on under the hood,–and what should go on when you turn the key. It’s a revisit of the engine (Wankel, Diesel, Electric Hybrid). It’s a way of reclaiming the auto shop for the high-school fix-it guy. It’s not stopping to ask permission to rebuild a car from the spare parts and junk yards, rather than being forced to source parts from the manufacturer.

At the base of Law 3.0 is a soulful revisioning with respect to the concept of “car” from a semantic point of view, taking the car as an algorithm and figuring out which problems it solves and whether there are better ways to solve them.

In this revolutionary approach to manipulation of the rules and processes upon which law is premised, there is TREMENDOUS opportunity.

To get a handle on this, one way that the Obama Administration is poised to help us is with its Project Management approach to problem-solving. Few lawyers would dispute that litigation management is as much about managing the process as about managing the content. So, it’s time for us to heal ourselves and let the geeks help us by creating analogies between the problems we face and solutions that already exist, so we can innovate rather than regurgitate.

We need to take another look at process, and figure out new ways to solve the problems faced by, e.g. The Rules of Civil/Criminal Procedure and Code of Military Justice. We need to ask whether they really work for us,–a collaborative discussion that will make itself clear as the Obama Administration works its way through the closing of Gitmo.

Rather than an economic approach (which I fear is the legacy of the Chicago School) to law, I urge The Deciders to consider, instead, a Project Management approach. This means replacing the fundamental premise of scarcity and supply/demand curves with the analysis of problems and steps needed to reach a solution. This requires tools like PERT:

PERT was developed primarily to simplify the planning and scheduling of large and complex projects. It was able to incorporate uncertainty by making it possible to schedule a project while not knowing precisely the details and durations of all the activities. It is more of an event-oriented technique rather than start- and completion-oriented, and is used more in R&D-type projects where time, rather than cost, is the major factor. [ Wikipedia ]
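To make the "uncertainty" part concrete, here is a small sketch of the classic PERT three-point estimate: expected time = (optimistic + 4 × most likely + pessimistic) / 6, with a standard deviation of (pessimistic − optimistic) / 6. The litigation tasks and the day counts below are invented purely for illustration, and the tasks are assumed to run one after another.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """One activity with a three-point duration estimate, in days."""

    name: str
    optimistic: float
    most_likely: float
    pessimistic: float

    @property
    def expected(self) -> float:
        # Classic PERT weighted average.
        return (self.optimistic + 4 * self.most_likely + self.pessimistic) / 6

    @property
    def std_dev(self) -> float:
        # Classic PERT spread: a wider optimistic/pessimistic gap means more uncertainty.
        return (self.pessimistic - self.optimistic) / 6


# Hypothetical discovery phases; the numbers are made up.
tasks = [
    Task("collect ESI", 5, 10, 21),
    Task("review documents", 20, 30, 70),
    Task("produce discovery", 3, 5, 13),
]

for task in tasks:
    print(f"{task.name}: {task.expected:.1f} days (+/- {task.std_dev:.1f})")

total_expected = sum(task.expected for task in tasks)
print(f"total: {total_expected:.1f} days")
```

The point is that nobody has to pretend to know how long document review will take; a best case, a worst case and a gut feeling are enough to start scheduling.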

PERT is brilliant stuff. And, it’s something that the techno-geeks around us understand fantastically well, because most are familiar with tools like agile development and scrum.

They.get.stuff.done.and.stuff.works.

Were we to take some of those processes and approaches, the refrain “Yes, We Can!” after January 20th would be (with apologies to Montell Jordan) “This Is How We Do It!”

Ambient Temperature Around a Problem

What a great quote!

Tadhg Ó Raghallaigh is on NPR’s Science Friday right now…no, I’m not twittering about it!…and he came out with the phrase (“Ambient Temperature Around a Problem”), as a way of describing how an agency such as the EPA can use social networking tools to gather intel to help them sort out issues and potential solutions.

Interesting that with the RIGHT group of followers, one can use Twitter to gather information as well as disseminate information quickly and without spending a lot of time. Transparently. And, without permission. Hmmm! Sounds like something out of the Anarchist’s Cookbook as much as the Cluetrain Manifesto. Indeed, on top of the mention of the NPR story, is a blog article about how terrorists might use Twitter. A duh.

I’ve spent a fair amount on animal books, and they truly are examples of excellence in publishing. I am so incredibly jealous of anyone with an O’Reilly shrine like this one:

[ image: a rainbow of O’Reilly books ]

As a big fan of the O’Reilly empire…Tim’s continual redefinition and re-engineering of the meaning of Web 2.0 means that it will eventually make sense as a business proposition. As currently defined, Web 2.0 is really nothing more than a way of talking about (and commercializing) the resurgence of the web, but it’s important for lawyers to understand that Web 2.0 companies leverage user information.

Why? Because lawyers tend not to ask for the opinions of others. We’re supposed to know everything. If we ask, it’s other lawyers, books and treatises. All very conservative.

But, what if lawyers start to open up to the web? What if they built a channel on Twitter,–open only to lawyers, that lawyers could use to ask about case theories in real-time? Chances are, lawyers would begin by using it to ask about the ruling history of cranky old judges, but eventually, they could use it to do case law shout-outs. To test theories amongst peers and to weigh in on drafts about legislation sure to be circulated by the Obama administration.

This is why companies who just throw a few cute widgets on their websites are throwing away money. Gathering information from users without using it is NOT Web 2.0. For Web 2.0, you need to think deeply about what data means to you and how you can participate in the international feedback loops created by broad adoption of Internet standards.

With the right tweaking, other functions (such as real-time, cross-jurisdictional investigation of criminal defendants) aren’t far behind, so I look forward to the continued evolution of Web 2.0+ in the legal space.

What it might take, however, is for Twitter to publish a more conservative channel,–after all, only cool lawyers will admit to being twits.

Customer Relationship Management Software

Here’s an interesting list entitled “10 things every CRM software solution needs to include”:

  1. Familiar and user-friendly interface
  2. Tight integration with the Microsoft Office system and Office Outlook
  3. Reliable back-office integration (time and billing, etc.)
  4. True platform flexibility
  5. Current technology
  6. Quick and easy access to your data
  7. Customizable views and workspaces
  8. Powerful reporting and analysis tools
  9. Mobile access to data
  10. Quantifiable ROI and total cost of ownership

You can get the White Paper by CRM4Legal here. Honestly, I really am not sure how they reconcile dependence upon Microsoft products with some of these items.

In my book, a Law 2.0 office absolutely has to learn to leverage .doc, .html and .pdf formats. So long as those file types are supported, the job is 80% done. Nearly every other format can be converted into one of these three, so let’s look at why these are the key file types for web 2.0 communication.

What do Lawyers Do?

If lawyers were crows, we would stand around all day yakking. That’s because most of what we do is communicate. Some people think that our primary directive is to generate paper, but that’s not quite true: we generate paper, because that’s the way we communicate ideas for purposes of preserving those ideas.

So, if we generate paper to communicate ideas, it behooves us to first ask the question as to whether paper is required.

In court, the stenographer only generates paper when she needs to do so. In modern courts, the read back is done without printing,–it shows up on the computer screen in real-time. Further, more courts are leaning towards providing the jury and judge with computer screens, so that very little documentary evidence need ever be printed out. Print outs these days are more for effect than for accurate communication of the ideas contained within.

What a huge revolutionary shift from the concerns of the Best Evidence Rule! Some might say that it’s really a smooth transition from the issue of photocopies and I would agree, but I remember when mimeograph copies were quite literally cranked out by hand because copying by xerox was far too expensive.

Office Communication

Heaven forbid we should ever stop talking to each other, but within the office, communication by collaborative application or email ought to be the rule for items that require a history. IM is fantastic, but it’s too easy to lose knowledge, and too hard to transition an IM decision to documentation, unless someone has figured out a way to automatically print and tag an IM conversation for later retrieval. Does anyone know whether it has been done??

In my view, even phone messages should always be taken on-line and placed in an attorney’s Tickle List. I have gone through as many phone message books as anyone, and it’s simply inefficient and messy given the tools now available. At the very least, look into something quick and dirty like Caliente or PhoneSlips. The interfaces on these applications are amazingly AWFUL, but they look like they get the job done.

On the other hand, your son or daughter should be able to whip one up for you in about an hour using Access or PHP. I wrote one called “QuickMail” for a legal call center, which I am probably going to give away as open-sourced freeware.

Third-Party Written Communications

Website Operations