Executive Summary
Children’s books provide a portal to creativity and human development. However, even as the importance of positive role models for self-esteem becomes more apparent, many children do not see themselves represented in the stories they read. Using machine learning techniques and natural language processing, we can programmatically generate stories with characters and contexts to make all readers feel represented, empowering them to learn and grow.
We seek to raise $175,000 to create a beta version of our iOS application. This funding will pay for an application designer, consulting from a data scientist and a children’s linguistics expert, legal support, and access to data in the form of children’s literature. We intend to publish the app in the Apple App Store in 2019.
Our Challenge
Today, over 20% of US households spend more than 25% of their income on their children. Children’s books comprise a significant share of that budget. The proper development of a child’s reading ability starts at day one and has lasting effects on that child’s later ability to succeed in life. The US children’s book publishing industry is big business, to the tune of $166 million in profit on $2.3 billion of revenue annually, with ebooks comprising 12% of the industry. Outside the home, US public schools spend an average of about $939 per student on supplies each year, approximately $47 billion in total.
However, opportunities remain. A 2018 study found that bias is often present in children’s books. Female characters are grossly underrepresented, or appear only as the sidekick. The picture is even worse for people of color: they are represented in just 3% of books. Children with special needs are almost entirely missing from children’s books. It is important for children’s books to represent the diversity of the “real” world [6]. Exposure to diversity through children’s books can help “normalize” for children what may otherwise be perceived as different. It is also an opportunity for children to see role models in what they read.
Introducing fAIrytale
Children, parents and teachers no longer have to search for stories that represent them. fAIrytale is an app created with the mission to represent, empower and grow young readers and their communities through interactive storytelling. From personalized avatars to customizable storylines and dynamic creative, fAIrytale uses the power of augmented intelligence to make reader-centered stories a reality.
We envision programmatically populating stories using parameters filled out by a reader. Stories are co-created with the readers based on the child’s development level and selected descriptors. A reader would be able to choose characters’ gender, ethnicity, family structure, special needs, or more.
Initially, we will use a corpus of pre-written stories customized with character names, pronouns, and family dynamics that match those specified by the reader. fAIrytale fills in the chosen story with the reader’s parameters. For instance, if a reader indicates that their parents are a same-sex couple, fAIrytale will automatically populate the story with, for example, two moms and the matching pronouns. This will accomplish our minimum goal of aiding representation of groups that don’t usually get to see themselves as the heroes in the story.
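The substitution step described above can be sketched in a few lines of Python. This is a minimal illustration, not the planned implementation: the pronoun table, the family options, the sample story, and the function name `personalize` are all hypothetical.

```python
from string import Template

# Hypothetical pronoun table; a real list would be reviewed by a linguist.
PRONOUNS = {
    "she": {"subj": "she", "obj": "her", "poss": "her"},
    "he": {"subj": "he", "obj": "him", "poss": "his"},
    "they": {"subj": "they", "obj": "them", "poss": "their"},
}

# Hypothetical family structures mapped to the phrase used in the story text.
FAMILIES = {
    "two_moms": "two moms",
    "two_dads": "two dads",
    "mom_and_dad": "mom and dad",
    "grandparents": "grandparents",
}

# A made-up pre-written story with placeholders for the reader's parameters.
STORY = Template(
    "$name packed $poss backpack and waved goodbye to $poss $family. "
    "Today $subj would climb the tallest hill in town."
)


def personalize(story, name, pronoun, family):
    """Fill the story placeholders with the reader's selections."""
    fields = {"name": name, "family": FAMILIES[family], **PRONOUNS[pronoun]}
    return story.substitute(fields)


print(personalize(STORY, "Maya", "she", "two_moms"))
```

Plain substitution like this only works when the surrounding grammar does not change with the pronoun. A story that says "she is" would need "they are" for a reader who selects "they", which is one place where the linguistic and machine learning work would come in.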
Once parameters are selected, imagery that aligns with each sentence will populate the screen. This can be accomplished using dynamic creative, a technique already used in digital advertising. Dynamic creative optimization selects the best combination of elements to include in an ad based on data and real-time feedback. A similar technique could be used to create thousands of iterations of imagery that align with text from the story, for instance, penguins eating ice cream at a beach.
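One simple way to pick imagery "based on data and real-time feedback" is an epsilon-greedy bandit: mostly show the illustration variant that readers have responded to best, and occasionally try another. The sketch below uses that technique as a stand-in for dynamic creative optimization; the class name `CreativePicker`, the variant labels, and the idea of a "liked" signal are assumptions for illustration.

```python
import random


class CreativePicker:
    """Epsilon-greedy choice among illustration variants for one sentence."""

    def __init__(self, variants, epsilon=0.1, seed=None):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.shows = {v: 0 for v in variants}  # times each variant was shown
        self.likes = {v: 0 for v in variants}  # times the reader responded well

    def rate(self, variant):
        """Observed like rate; 0.0 for a variant that has never been shown."""
        shown = self.shows[variant]
        return self.likes[variant] / shown if shown else 0.0

    def pick(self):
        """Explore a random variant with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shows))
        return max(self.shows, key=self.rate)

    def record(self, variant, liked):
        """Store one piece of reader feedback for a variant that was shown."""
        self.shows[variant] += 1
        self.likes[variant] += int(liked)


picker = CreativePicker(["penguins_beach_day", "penguins_beach_sunset"], seed=1)
picker.record("penguins_beach_day", liked=False)
picker.record("penguins_beach_sunset", liked=True)
print(picker.pick())
```

A production system would choose among combinations of elements (character, background, palette) rather than whole images, but the feedback loop has the same shape.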
Pilot
Today, much of what fAIrytale can be is aspirational. We are confident that any one piece of this plan has the potential to be a successful business. Currently, we are asking for $175,000. With this money, we will work with a mobile application developer to help build out the system. We will also buy time with a data scientist to build the machine learning system that populates the story, consult a linguist to understand what language is most appropriate for children, and retain legal services to protect any intellectual property. Currently, we have access to the text of one hundred contemporary children’s stories and many more public domain stories through Project Gutenberg. Remaining funding will be used to license access to additional stories, thereby increasing the robustness of our data.
After interviewing several parents about our planned offering, we found strong support for our idea. One parent indicated they spend $500/year on books for their child. Another indicated his desire to share his Indian heritage with his child growing up in Chicago.
We envision our target segments to be both families and schools. Parents will have access to a freemium service. The free version will have more limited customization and, depending on further customer research, may have a limited number of monthly stories. The full-featured version of fAIrytale will allow for more robust customization and unlimited access for a monthly fee. Based on preliminary interviews with parents, we plan to charge $7.99 per month.
fAIrytale will also target school districts, to enable schools to further expand their libraries and engage students. Schools will also have access to the free model, or will be able to license access to the premium version.
Future Considerations
This is only the first challenge in our quest. Ultimately, our goal is AI-generated children’s storytelling.
We will tag our existing database for story, semantic structure and other characteristics. Humans will provide nuance, while algorithmic tagging will provide insight into parts of speech and sentence structure, and word embeddings such as word2vec will capture relationships between words. From there, we can automate portions of the story generation process. Because children’s stories are generally simpler than other prose, the task of algorithmically generating one should be less complex.
Even as the natural language processing capability develops, story curation will still be required. The AI will propose 2-3 candidate next sentences, and the reader will choose which one continues the story. These interactions will also provide data on how readers engage with stories, allowing our algorithms to learn which stories readers prefer.
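The propose-and-choose loop can be illustrated with a deliberately simple generator. The sketch below uses a first-order (bigram) Markov chain over words, not a neural model, purely as a placeholder for whatever generator is eventually used; the toy corpus and the names `build_chain`, `generate`, `propose`, and `choice_log` are hypothetical.

```python
import random
from collections import defaultdict

# Toy corpus, made up for illustration.
CORPUS = [
    "the penguin found a red kite",
    "the penguin shared ice cream with a friend",
    "a friend found the beach",
]


def build_chain(sentences):
    """Map each word to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, rng, max_words=12):
    """Walk the chain from the start marker until the end marker or the cap."""
    word, out = "<s>", []
    while len(out) < max_words:
        word = rng.choice(chain[word])
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)


def propose(chain, rng, k=3, attempts=50):
    """Collect up to k distinct candidate sentences."""
    candidates = []
    for _ in range(attempts):
        sentence = generate(chain, rng)
        if sentence not in candidates:
            candidates.append(sentence)
        if len(candidates) == k:
            break
    return candidates


rng = random.Random(0)
options = propose(build_chain(CORPUS), rng)
choice_log = []  # interaction data that later training could draw on
chosen = options[0]  # stand-in for the reader tapping the first option
choice_log.append({"offered": options, "chosen": chosen})
print(options)
```

Logging both the offered candidates and the chosen one, rather than the choice alone, is what makes the interaction data usable as a preference signal later.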
The primary candidates to enable this today are generative adversarial networks, long short-term memory networks (a recurrent neural network architecture that has proven results for time-delayed tasks, such as talking about a character that doesn’t appear in every sentence) and hidden Markov models (especially known for their application in reinforcement learning and temporal pattern recognition such as speech, handwriting, gesture recognition, and part-of-speech tagging). With a working model to generate simple stories combined with research on childhood language development, we can adjust stories to the reading skill of our readers and help them grow.
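To make the "time-delayed" property concrete, the sketch below runs a single-unit long short-term memory cell using the standard gate equations. The weights are set by hand for illustration, not learned, and the input is a hypothetical signal that is 1 when a character is mentioned and 0 otherwise. The cell state stays high for several steps after the mention, which is the behavior that lets such a network keep track of a character who is absent from intervening sentences.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """One step of a single-unit LSTM.

    weights maps a gate name to (input weight, recurrent weight, bias).
    """
    def pre_activation(gate):
        w_x, w_h, bias = weights[gate]
        return w_x * x + w_h * h_prev + bias

    i = sigmoid(pre_activation("input"))        # how much new content to write
    f = sigmoid(pre_activation("forget"))       # how much old memory to keep
    o = sigmoid(pre_activation("output"))       # how much memory to expose
    g = math.tanh(pre_activation("candidate"))  # the new content itself
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hand-set (not trained) weights: write when x == 1, otherwise just remember.
WEIGHTS = {
    "input": (20.0, 0.0, -10.0),
    "forget": (0.0, 0.0, 10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (5.0, 0.0, 0.0),
}


def run(inputs):
    """Feed a sequence through the cell and return the final (h, c)."""
    h, c = 0.0, 0.0
    for x in inputs:
        h, c = lstm_step(x, h, c, WEIGHTS)
    return h, c


# The character is mentioned once, then absent for four sentences.
h, c = run([1, 0, 0, 0, 0])
print(round(c, 3))
```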
Beyond the technical, there are also various features and business opportunities we will explore as fAIrytale grows. These include interactive images, so that children can touch the screen and get a response. Parent interviews revealed a desire to train the AI to mimic a parent’s voice, so that parents can still “read” to their child when they are away. The app could enable multiple device logins, so that families can share stories, even at a distance.
Additional opportunities include creating merchandise around recurring characters, or printing favorite stories as physical books. In the classroom, we see additional opportunities to develop lesson plans and learning opportunities through stories.
Challenges & Risks
We strongly believe in this business model, but acknowledge the risks. While the pilot technology is certainly executable, the generation of new stories is a challenge. However, techniques such as word embeddings, generative adversarial networks, and long short-term memory networks are making big strides in the area of natural language processing. By developing our application starting with our pilot, we can take advantage of this technology as it develops.
Additionally, in using existing stories to train generative systems, we risk codifying biases from past works. Word embedding techniques show that the word “girl” is more closely related to “homemaker” than “scientist”. Used naively, these techniques would perpetuate harmful biases. However, work on the subject has shown that it is possible to differentiate between bias and properly gendered words. For example, “girl” should be considered closer to “waitress” than “waiter”. We expect these debiasing techniques to extend to other dimensions of bias, such as ethnicity.
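The distinction between bias and properly gendered words can be shown with a small projection-based sketch. The idea, often called neutralizing, is to remove the gender component from words that should be neutral (such as "scientist") while leaving definitionally gendered words (such as "waitress") untouched. The three-dimensional vectors below are made up for illustration and are not real word2vec output; the function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def neutralize(vec, direction):
    """Remove the component of vec that lies along direction."""
    length = math.sqrt(dot(direction, direction))
    unit = [a / length for a in direction]
    scale = dot(vec, unit)
    return [a - scale * b for a, b in zip(vec, unit)]


# Toy embeddings (assumed values); the first axis loosely encodes gender.
EMB = {
    "girl": [1.0, 0.2, 0.1],
    "boy": [-1.0, 0.2, 0.1],
    "scientist": [-0.4, 0.9, 0.3],
    "waitress": [0.8, 0.1, 0.9],
    "waiter": [-0.8, 0.1, 0.9],
}

# Estimate a gender direction from one definitional pair.
gender_direction = [g - b for g, b in zip(EMB["girl"], EMB["boy"])]

# Only words that should be gender-neutral are projected.
SHOULD_BE_NEUTRAL = {"scientist"}
debiased = {
    word: neutralize(vec, gender_direction) if word in SHOULD_BE_NEUTRAL else vec
    for word, vec in EMB.items()
}

before = cosine(EMB["scientist"], EMB["girl"]) - cosine(EMB["scientist"], EMB["boy"])
after = cosine(debiased["scientist"], debiased["girl"]) - cosine(debiased["scientist"], debiased["boy"])
print(before, after)
```

In this toy setup, "scientist" starts out closer to "boy" than to "girl" and ends up equally close to both, while "waitress" remains closer to "girl" than "waiter" is. Deciding which words belong in the neutral set is the human judgment the approach depends on.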
Looking Forward
We believe that fAIrytale holds great promise. The technology to create a minimum viable product is proven, and the technology for many of our aspirational goals is fast improving. Given the extremely positive responses from interviews with parents and school teachers, and the importance of reading to child development, this project holds great potential not only in terms of economic value but in social value as well. We are excited to choose our adventure with fAIrytale, and we hope you choose to join us.
CITATIONS:
[1] Rivera, Edward. “OD4394: Children’s Book Publishing Industry Report”. IBISWorld. Dec. 2017. Web. 5/28/2018.
[2] https://www.nytimes.com/2014/03/16/opinion/sunday/where-are-the-people-of-color-in-childrens-books.html?_r=2
[3] Speech, Language, and Swallowing – Development, American Speech-Language-Hearing Association, https://www.asha.org/public/speech/development/34/. Accessed 28 May 2018.
[4] School spending by student, https://nces.ed.gov/programs/coe/indicator_cmb.asp
[5] Horning, Kathleen T. Publishing Statistics on Children’s Books about People of Color and First/Native Nations and by People of Color and First/Native Nations Authors and Illustrators, Cooperative Children’s Book Center School of Education, University of Wisconsin-Madison, 22 Feb. 2018, ccbc.education.wisc.edu/books/pcstats.asp. Accessed 29 May 2018.
[6] Epstein, BJ. Kids Need Diversity in Books to Prepare Them for the Real World, Newsweek, 7 Feb. 2017, www.newsweek.com/childrens-books-diversity-ethnicity-world-view-553654. Accessed 29 May 2018.
[7] http://www.jmlr.org/papers/volume3/gers02a/gers02a.pdf