
Finch for Text® Outperforms 14 Popular Text Analytics Solutions

Internal Assessment Reveals Superior Entity Extraction Accuracy across People, Organizations and Geographic Places

June 5, 2017

 

Reston, VA – Finch Computing, which helps customers solve their unstructured text challenges in real time and at scale, today released the results of an internal assessment showing that its Finch for Text® analytics solution outperformed 14 popular text analytics solutions currently on the market.

“We are incredibly pleased with these results, but not surprised,” Finch Computing Chief Technology Officer Scott Lightner said. “We’ve known for a long time that our IP and the advancements we’ve made in this domain could help customers use, understand and interact with their informational assets in entirely new ways. These results show that it is not only possible but critical to expect accuracy scores much, much higher than have typically been accepted.”

To perform its tests, Finch Computing assessed 14 of the text analytics solutions most often covered in the press, mentioned by customers or held up as leading providers.

Using the publicly available versions of these products, Finch Computing tested every solution on the same 400-document corpus of news and social media content, chosen precisely because it varies in topic, entities, length and more. This dataset is a representative example of a streaming, human-generated content feed, much like the content that enterprises and government organizations need to understand every day: emails, research reports, message traffic and so on.

Specifically, the tests measured the solutions’ ability to extract named entities from unstructured text. Extraction refers to isolating entities in text and correctly determining their type. For example, a well-performing entity extraction solution can understand that the following statement is about an online retailer and not a South American river: “I do most of my shopping on Amazon.”

In this instance, text analytics solutions like Finch for Text® should correctly identify the entity “Amazon” and should also understand that “Amazon” is a company and not a geographic entity.

The measures of Precision and Recall are commonly used together to gauge the performance of text analytics solutions. Recall is the share of the entities actually present in the text that a solution finds; identifying “Amazon” as a named entity in the example above counts toward Recall. Precision is the share of a solution’s extractions that are correct; understanding that in this instance “Amazon” is a company and not a river counts toward Precision.
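To make the two measures concrete, here is a minimal Python sketch of how Precision and Recall are conventionally computed for entity extraction. It assumes exact-match scoring on (text, type) pairs; the release does not describe the scoring method Finch Computing actually used, and the function name, type labels and sample entities are illustrative only.

```python
from typing import Set, Tuple

Entity = Tuple[str, str]  # (entity text, entity type), e.g. ("Amazon", "ORG")


def precision_recall(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float]:
    """Score predicted entities against a hand-labeled gold set.

    An extraction counts as correct only if both text and type match.
    """
    true_positives = len(gold & predicted)
    # Precision: what share of the returned extractions are correct?
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: what share of the entities actually present were found?
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical labels for a single short document.
gold = {("Amazon", "ORG"), ("Jeff Bezos", "PERSON"), ("Seattle", "GEO")}
# This system finds two entities but labels "Amazon" as a place.
predicted = {("Amazon", "GEO"), ("Jeff Bezos", "PERSON")}

p, r = precision_recall(gold, predicted)
print(f"Precision {p:.0%} / Recall {r:.0%}")
```

In this sketch, mislabeling “Amazon” as a river lowers Precision, while missing “Seattle” altogether lowers Recall, which is why the two scores are reported together.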

What follows are the precision and recall results for Finch for Text® compared with each of the other solutions. Results are expressed as Precision/Recall in percentages across PEOPLE, ORGANIZATIONS and GEOGRAPHIC PLACES. And the results are clear: Finch for Text® posts the stronger overall result in every head-to-head comparison, across every entity type. The two cases in which a competitor’s precision is higher are marked with asterisks and explained below the table.

 

Solution           PEOPLE     ORG        GEO
Finch for Text®    90/90      91/76      84/89
Vendor 1           76/51      75/59      60/76
Vendor 2           94/68*     94/42**    78/62
Vendor 3           71/57      50/37      53/68
Vendor 4           78/9       85/21      76/60
Vendor 5           87/82      79/49      75/76
Vendor 6           88/41      75/48      69/68
Vendor 7           32/90      54/70      42/83
Vendor 8           49/58      72/34      70/57
Vendor 9           63/59      69/40      70/66
Vendor 10          80/45      70/47      55/64
Vendor 11          81/51      61/40      43/75
Vendor 12          56/77      78/60      74/70
Vendor 13          77/66      64/63      59/75
Vendor 14          78/78      55/62      46/84

* On extractions of people mentioned in text, Vendor 2’s precision is higher (94 versus 90), but Finch for Text® outperformed it on recall (90 versus 68), meaning it captured the entities in the documents more fully. Vendor 2’s higher precision applies to a smaller, incomplete set of results.

** The same is true of organization extractions. Vendor 2’s precision is higher (94 versus 91), but its recall is 34 points below the Finch for Text® score (42 versus 76), roughly 45% lower in relative terms, meaning its analysis is less complete.
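The recall gaps described in the two notes above can be checked directly against the table. The short Python sketch below does the arithmetic using the Vendor 2 and Finch for Text® recall scores; the function name is illustrative only, and the distinction it draws between percentage points and relative percent is a reading aid rather than part of the original assessment.

```python
from typing import Tuple


def recall_gap(finch: int, vendor: int) -> Tuple[int, float]:
    """Return the gap in percentage points and as a percent of the Finch score."""
    points = finch - vendor
    relative = points / finch * 100
    return points, relative


# Recall scores taken from the table: Finch for Text® versus Vendor 2.
people_points, people_relative = recall_gap(finch=90, vendor=68)
org_points, org_relative = recall_gap(finch=76, vendor=42)

print(f"PEOPLE: {people_points} points, {people_relative:.1f}% lower")
print(f"ORG:    {org_points} points, {org_relative:.1f}% lower")
```

Percentage points measure the raw difference between two scores, while the relative figure expresses that difference as a share of the Finch for Text® score.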

“In many cases, we’re 30 and 40 points higher than these other solutions,” Lightner continued. “We think that speaks volumes about the need for organizations to expect more from their text analytics partners. More accuracy, more functionality, more results. That’s what we’re bringing to market with Finch for Text.”

Beyond superior entity extraction, Finch for Text® also offers entity disambiguation, entity enrichment and sentiment assignment. More than two dozen pieces of intellectual property make it unique and performant in a variety of business and mission-critical use cases.

To learn more about Finch Computing, Finch for Text® or this bake-off, please visit www.finchcomputing.com or contact sales@finchcomputing.com.

 

###