Announcing FitBERT: Open source NLP library for engineers and researchers

    This just in: we’re open sourcing FitBERT, a library to make it easy for anyone who knows Python to use BERT (or other fancy deep learning NLP models) for smarter string interpolation.

    Why did we do this? Most people who write code don’t know how to use deep learning frameworks, or any machine learning at all. Even if you do know how to use these frameworks, sometimes you want to quickly interact with BERT. You may even be willing to give up some flexibility in exchange for ease of use. It’s for these cases that we wrote FitBERT.

    What does FitBERT mean?

    FitBERT stands for “Fill in the Blanks, BERT”. BERT is Google’s very large masked language model. It was released in 2018 and it shattered benchmarks in natural language processing tasks such as sentiment analysis and question answering.

    So, we named our project FitBERT because it’s very good at filling in the blanks in sentences. It does that using what I’m calling “smart string interpolation”.

    What is smart string interpolation?

    If string interpolation is a new term for you, here’s a quick explanation to help you understand it.

    In engineering, a computer writes “Hello [your name]” using code like this: print(f"Hello {name}"). The act of evaluating that placeholder and filling in the blank correctly is string interpolation.
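    As a minimal illustration of standard string interpolation in Python, the snippet below fills a single placeholder with a value that is already known. The value "Ada" is a made-up example, standing in for whatever a real application would look up.

```python
# Standard string interpolation: the value for the placeholder is known in advance.
name = "Ada"  # hypothetical value; in practice this might come from a database

# The f-string evaluates {name} and substitutes its value into the text.
greeting = f"Hello {name}"
print(greeting)
```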

    In standard string interpolation, a specific value is provided to fill in the blank. This is, for example, how marketers send us emails that use our name. Our names are in a database, and that database is used to determine exactly which name to put into the placeholder.

    Smart string interpolation takes that to the next level: engineers supply a list of possible values, and the computer selects the best choice.

    So say you have code like: print(f"The {food_item} is {adjective}!"). The food item is filled in using normal string interpolation; a database lookup decides that food_item should be “sandwich” or “burrito” or “salad”. Then, smart string interpolation lets the computer decide which adjective from the list makes the most sense for the selected food item.
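    The idea can be sketched in plain Python as "try every option in the blank and keep the highest-scoring sentence". This is not FitBERT's actual API. The names smart_interpolate, toy_score, and the "[adjective]" blank marker are assumptions made up for this sketch, and the hand-written score table stands in for a real masked language model such as BERT.

```python
from typing import Callable, Sequence

BLANK = "[adjective]"  # hypothetical marker for the blank to be filled in


def toy_score(sentence: str) -> float:
    """Stand-in for a language model: returns a hand-written plausibility score.

    A real system would ask a masked language model how likely the sentence is.
    Sentences missing from the table score 0.0.
    """
    plausibility = {
        "The burrito is spicy!": 0.9,
        "The burrito is crisp!": 0.2,
        "The salad is crisp!": 0.8,
        "The salad is spicy!": 0.3,
    }
    return plausibility.get(sentence, 0.0)


def smart_interpolate(
    template: str,
    options: Sequence[str],
    score: Callable[[str], float] = toy_score,
    blank: str = BLANK,
) -> str:
    """Fill the blank with each option and return the best-scoring sentence."""
    if not options:
        raise ValueError("at least one option is required")
    candidates = [template.replace(blank, option) for option in options]
    return max(candidates, key=score)


# Normal interpolation fills in the food item; the adjective is left blank.
food_item = "burrito"
template = f"The {food_item} is {BLANK}!"

result = smart_interpolate(template, ["crisp", "spicy"])
print(result)
```

    Passing the scorer as a parameter keeps the selection logic separate from the model, so the toy table could be swapped for a model-backed scoring function without changing smart_interpolate.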

    If you’re interested in learning more about how to use our FitBERT open source project, read the full article on Medium.

    FitBERT in research

    FitBERT can also be used in serious research! FitBERT is based on Yoav Goldberg’s paper “Assessing BERT’s Syntactic Abilities”, which assesses “the extent to which the recently introduced BERT model captures English syntactic phenomena.” See the full FitBERT article for code samples on how to reproduce or extend these experiments.

    Interested in what we’re working on next in NLP and machine learning? Message our team at