Rosette Name Indexer


Verify identities and match names and organizations against vast databases with industry-leading accuracy and recall.

Overview

Fuzzy name matching is hard

Names are vitally important data points in financial compliance, anti-fraud, government intelligence, law enforcement, and identity verification. Yet it can be challenging to match names when your data includes misspellings, aliases, nicknames, initials, names in different languages, and more.

Our name indexer solves these challenges with a linguistic, knowledge-based system that compares and fuzzy matches names of people, locations, and organizations. Built by linguistics experts, our name matching is unrivaled in its ability to connect entities with high adaptability, precision, and scalability.

Industry-leading indexing model

Rosette blends machine learning with traditional name matching techniques such as name lists, common key, and rules to determine a match score. This score can be used to maximize precision or recall depending upon the application.
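
To illustrate how a single match score lets an application favor precision or recall, here is a minimal Python sketch. The scored pairs and their labels are made-up illustration data, not Rosette output, and `precision_recall` is a hypothetical helper introduced only for this example.

```python
# Hypothetical (score, is_true_match) pairs; illustration data only.
SCORED_PAIRS = [
    (0.95, True),
    (0.88, True),
    (0.81, False),
    (0.74, True),
    (0.62, False),
    (0.40, False),
]


def precision_recall(pairs, threshold):
    """Return (precision, recall) when pairs scoring >= threshold count as matches."""
    predicted = [is_match for score, is_match in pairs if score >= threshold]
    true_positives = sum(predicted)
    total_true = sum(is_match for _, is_match in pairs)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / total_true if total_true else 1.0
    return precision, recall


for threshold in (0.85, 0.70):
    p, r = precision_recall(SCORED_PAIRS, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

With this sample data, the higher threshold returns only true matches but misses one of them, while the lower threshold finds every true match at the cost of one false positive.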

This highly adaptable model recognizes 13 different name phenomena (all 13 are listed in the Tech Specs section). Two examples:

Same name in multiple languages: Mao Zedong ↔ Мао Цзэдун ↔ 毛泽东
Semantically similar names: Eagle Pharmaceuticals, Inc. ↔ Eagle Drugs

Product Highlights

  • 15 supported languages
  • Matches names of people, locations, and organizations
  • Increases name search accuracy
  • Ranks results by relevancy with a similarity score
  • Intuitive cloud API
  • Customizable SDK
  • Fast and scalable
  • Industrial-strength support
  • Constantly stress-tested and improved

How It Works

The industry leader in names

Our name indexer uses machine learning rather than generated lists of name variations to perform fuzzy name matching. This approach can match never-before-seen names. It also avoids the problem of an exponentially growing list: even a three-element name (first, middle, last) with 12 variations for each element would add 12 x 12 x 12 = 1,728 variations to a list.
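
The arithmetic above can be checked with a short Python sketch. The placeholder variant names are hypothetical stand-ins for real spelling variants, and `variant_count` is a helper introduced only for this illustration.

```python
from itertools import product

# Hypothetical placeholders standing in for 12 spelling variants per element.
first = [f"first_{i}" for i in range(12)]
middle = [f"middle_{i}" for i in range(12)]
last = [f"last_{i}" for i in range(12)]

# A variant list needs one entry per combination of element variants.
full_name_variants = list(product(first, middle, last))
print(len(full_name_variants))  # 1728


def variant_count(variants_per_element: int, elements: int) -> int:
    """Number of list entries needed to cover every combination."""
    return variants_per_element ** elements


print(variant_count(12, 4))  # 20736
```

Adding a fourth name element multiplies the list by another factor of 12, which is why list-based approaches grow so quickly.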

Unlike expensive and less accurate legacy solutions driven by thousands of spelling variants, our tools have a smaller footprint and analyze the intrinsic structure of each name component to perform an intelligent comparison using advanced linguistic algorithms. Under the hood, the name indexer uses cutting-edge NLP techniques including neural networks, hidden Markov models, transliteration rules, and word embedding vectors.

Customizable to your needs

Our text analytics tools are unique in their adaptability. Our SDK or on-premise name indexer not only supports matching against vast data lakes, but can be tailored to fit your needs. You can, for example:

  • Set the minimum threshold of the similarity score to manage the precision and recall of search results
  • Create a list of “stopwords” to ignore when calculating matching scores (e.g., titles, honorifics)
  • Pre-set two names to always match with a given score (e.g., “Elizabeth” and “Lisbeth” always match at 90%)
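
The three customizations above can be pictured with a small Python sketch. This is not the Rosette SDK: every name here (`STOPWORDS`, `OVERRIDES`, `normalize`, `score`, `search`) is hypothetical, and the standard library's `difflib.SequenceMatcher` is used only as a stand-in scorer in place of the product's linguistic model.

```python
from difflib import SequenceMatcher

# All names below are hypothetical; difflib is a stand-in scorer, not Rosette's model.
STOPWORDS = {"dr", "mr", "mrs", "phd"}
OVERRIDES = {frozenset({"elizabeth", "lisbeth"}): 0.90}


def normalize(name: str) -> str:
    """Lowercase, drop periods, and remove stopword tokens such as titles."""
    tokens = name.lower().replace(".", "").split()
    return " ".join(t for t in tokens if t not in STOPWORDS)


def score(a: str, b: str) -> float:
    """Return the preset override score if one exists, else a simple string similarity."""
    a_norm, b_norm = normalize(a), normalize(b)
    override = OVERRIDES.get(frozenset({a_norm, b_norm}))
    if override is not None:
        return override
    return SequenceMatcher(None, a_norm, b_norm).ratio()


def search(query: str, candidates: list[str], threshold: float) -> list[tuple[str, float]]:
    """Return candidates scoring at or above the threshold, best first."""
    scored = [(c, score(query, c)) for c in candidates]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


print(score("Elizabeth", "Lisbeth"))          # 0.9 (preset override)
print(score("Dr. Jane Smith", "Jane Smith"))  # 1.0 (title ignored)
print(search("Jane Smith", ["Dr. Jane Smith", "Jan Smith", "Carlos Diaz"], 0.8))
```

Raising the threshold passed to `search` trims lower-scoring candidates (favoring precision), while lowering it admits more of them (favoring recall).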

We built our name indexing technology with large, complex databases in mind. Unlike other solutions that have been adapted to make them scalable, our name indexer was designed for customers with tens of millions of data entries, and use cases that cannot afford lags in performance and accuracy.

Tech Specs

Availability and Platform Support

Deployment availability:
Plugins:
Bindings:

Supported Languages

  • Arabic
  • Chinese, Simplified
  • Chinese, Traditional
  • English
  • French
  • German
  • Italian
  • Japanese
  • Korean
  • Pashto
  • Persian
  • Portuguese
  • Russian
  • Spanish
  • Urdu

13 Ways Rosette Matches Names

Phonetic similarity: Jesus ↔ Heyzeus ↔ Haezoos
Transliteration spelling differences: Abdul Rasheed ↔ Abd al-Rashid
Nicknames: William ↔ Will ↔ Bill ↔ Billy
Missing spaces or hyphens: MaryEllen ↔ Mary Ellen ↔ Mary-Ellen
Titles and honorifics: Dr., Mr., Ph.D.
Truncated name components: McDonalds ↔ McDonald ↔ McD
Missing name components: Phillip Charles Carr ↔ Phillip Carr
Out-of-order name components: Diaz, Carlos Alfonzo ↔ Carlos Alfonzo Diaz
Initials: J. E. Smith ↔ James Earl Smith
Names split inconsistently across database fields: Dick . Van Dyke ↔ Dick Van . Dyke
Same name in multiple languages: Mao Zedong ↔ Мао Цзэдун ↔ 毛泽东 ↔ 毛澤東
Semantically similar names: Eagle Pharmaceuticals, Inc. ↔ Eagle Drugs, Co.
Semantically similar names across languages: Nippon Telegraph and Telephone Corporation ↔ 日本電信電話株式会社


Cloud API

Easy-to-use API

Ideal for product evaluation, academic research, and smaller, cost-conscious businesses, our fast and powerful API is instantly accessible and free to get started.

Our matching endpoint supports only pairwise matching, generating a match score for any two names, locations, or organizations entered by the user. If you need to search for name matches against extensive databases of entities, talk to our customer engineering team about evaluating our on-premise name indexing.

Try name matching and the rest of Rosette API’s endpoints, free up to 10,000 calls/month!

Get an API Key

Quality documentation and support

Customers love our thorough and responsive support team. We also provide in-depth documentation that lists all the features and functions of the various API endpoints alongside examples in the binding of your choice.

Visit our GitHub for the bindings and documentation.

Enterprise ready

Evaluate Rosette’s functional fit with your business and data needs on our cloud API knowing that scalable, customizable, on-premise deployments are available if you need them.

The sample below shows a pairwise request comparing a Russian name with its English form, followed by the response containing the match score.

{
  "name1": {
    "text": "Влади́мир Влади́мирович Пу́тин",
    "language": "rus",
    "entityType": "PERSON"
  },
  "name2": {
    "text": "Vladimir Putin",
    "language": "eng",
    "entityType": "PERSON"
  }
}

{
  "result": {
    "score": 0.9486632809417912
  }
}
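
As a rough illustration of how the sample request and response above might be handled from code, here is a Python sketch that uses only the standard library. The endpoint URL and the API key header name are assumptions to be confirmed against the official API documentation, and `name_payload`, `build_request`, and `parse_score` are hypothetical helpers, not part of any Rosette binding. The request is built but not sent.

```python
import json
import urllib.request

# Assumed endpoint and header name; confirm both against the official API documentation.
API_URL = "https://api.rosette.com/rest/v1/name-similarity"
API_KEY_HEADER = "X-RosetteAPI-Key"


def name_payload(text, language, entity_type="PERSON"):
    """Build one name object in the shape shown in the sample request."""
    return {"text": text, "language": language, "entityType": entity_type}


def build_request(name1, name2, api_key):
    """Create (but do not send) a POST request comparing two names."""
    body = json.dumps({"name1": name1, "name2": name2}).encode("utf-8")
    headers = {
        API_KEY_HEADER: api_key,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def parse_score(response_text):
    """Extract the match score from a response shaped like the sample above."""
    return json.loads(response_text)["result"]["score"]


request = build_request(
    name_payload("Влади́мир Влади́мирович Пу́тин", "rus"),
    name_payload("Vladimir Putin", "eng"),
    api_key="YOUR_API_KEY",  # placeholder, not a real key
)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(request) as resp:
#     print(parse_score(resp.read()))

# Parsing the sample response shown above:
print(parse_score('{"result": {"score": 0.9486632809417912}}'))
```

Keeping request construction separate from sending makes the payload easy to inspect before any call counts against a monthly quota.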

On Premise

Match against massive databases on premise

For organizations with vast data quantities, unique integration needs, and data security restrictions, we provide on-premise API deployment and SDKs to be hosted on your internal servers. Our on-premise name indexer allows you to search for matches against enormous databases. Our tools are built to support fast, accurate matching against tens of millions of entities.

Request product evaluation

If your organization requires an on-premise solution, we’re happy to work with you to meet your business’s unique needs. For a free evaluation of our on-premise deployments, please complete the form below and our Customer Engineering team will provide you with an on-premise evaluation package.

Drop us a line

EMAIL:
info@basistech.com

PHONE:
+1-617-386-2000

No coding required

RapidMiner is the industry’s #1 predictive analytics platform. The client platform, RapidMiner Studio, empowers organizations to easily prep data, create models, and operationalize predictive analytics within any business process.

Try RapidMiner