How does Writer generate content?
Our content generation capabilities are powered by our family of large language models: deep learning models trained to generate text. The model is trained on over 300 billion tokens of text data and has over 20 billion parameters. From training, it learns how language is stitched together and constructed, so it can predict what might come next given a question or input.
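As a rough illustration of what "predicting what might come next" means, here is a toy sketch in Python. It is not how Writer's models are built: it only counts which word follows which in a tiny made-up sentence, whereas a large language model learns far richer patterns with a neural network. The function names and sample text are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text, standing in for billions of tokens.
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram_model(corpus)

print(predict_next(model, "the"))  # prints "cat" (seen twice after "the")
print(predict_next(model, "on"))   # prints "the"
```

The toy model picks a likely next word purely from patterns in its training text, not from any check of whether the result is true. The same holds for a far larger model.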
Where does the data for the model come from?
The 300 billion tokens come from open source and public domain text. This includes text corpora such as Common Crawl (an extremely large body of text crawled from the web, maintained by the nonprofit Common Crawl and widely used by the NLP community), books, and a complete copy of Wikipedia.
How does customization and tuning work?
To customize the outputs, we rely on best-in-class examples from you. This can be as few as one or two examples, or as many as thousands. With examples, our model can better understand the context in which it’s writing content, bringing the output more in line with your domain. Writer can also pick up on language and tone patterns to better match your style.
What does Writer do with the custom training data I provide?
If you decide to train Writer with your custom data, that data will only ever be usable in your version of Writer. We do not share your training data or any AI apps you build with anyone.
Who owns the IP to content created with Writer?
You, the customer, own all intellectual property rights to content created using Writer.
Is the content generated by Writer good for SEO?
Google’s algorithms reward original, high-quality content. The content produced by Writer is completely original. Quality is, of course, subjective, so we recommend that human team members review output before publishing. Review steps should include checking content structure, fact-checking, and incorporating original insights and information where relevant. If these steps are followed, humans and search engine algorithms are unlikely to distinguish the content from human-written content. Read more about AI-generated content and SEO on our blog.
Writer seems to know some details about my company, even before training it on my own data. Where did that come from?
Since our model is trained on public domain text found on the web, the more your company has a presence on the web, the more Writer will know how to generate text based on publicly known company information. If Writer knows nothing or very little about a company, it will generalize based on what it’s seen about other similar companies.
Where does Writer get facts, quotes, and statistics from? How can I verify these as true?
Facts, quotes, and statistics generated by Writer aren’t necessarily true. The model produces statements that would fit well in the content, but it has no way to confirm that they are true. A human editor should verify any facts, quotes, or statistics you see before publishing.