Lecture 1: Introduction
What’s NLG
• NLG systems are computer algorithms/systems which produce texts in
English or other human languages
• Input is data (raw or analyzed)
⁃ sometimes text, but NLG usually does not include MT (machine translation)
• Output is text:
⁃ sentences, reports, explanations, etc.
• Two aims:
⁃ Understanding language production (Theoretical NLG)
⁃ Building practically useful systems (Practical NLG)
Language technology
• From data to meaning: speech → speech recognition → text → NLU → meaning
• From meaning to data: meaning → NLG → text → speech synthesis → speech
Ex. 1: Weather forecast
• Input: numerical weather predictions
⁃ From supercomputer running a numerical weather simulation
• Output: textual weather forecast
⁃ Users often prefer some NLG texts over human-written texts
⁃ More consistent, better word choice
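A minimal sketch of how such a forecast generator might map numbers onto words. All field names, thresholds, and phrasing below are invented for illustration; they are not taken from any real forecasting system.

def describe_wind(speed_kmh):
    # Map a numeric wind speed onto a forecast phrase (toy thresholds).
    if speed_kmh < 12:
        return "light winds"
    if speed_kmh < 30:
        return "moderate winds"
    return "strong winds"

def generate_forecast(prediction):
    # Turn one numerical prediction record into a single forecast sentence.
    wind = describe_wind(prediction["wind_speed_kmh"])
    sky = "cloudy" if prediction["cloud_cover"] > 0.6 else "mostly clear"
    rain = "with occasional rain" if prediction["rain_mm"] > 0.5 else "and dry"
    return f"{prediction['period'].capitalize()}: {sky} {rain}, {wind}."

print(generate_forecast(
    {"period": "tonight", "wind_speed_kmh": 8.0, "cloud_cover": 0.8, "rain_mm": 1.2}
))
# -> Tonight: cloudy with occasional rain, light winds.

Because the wording is fixed by rules like these, the terminology never drifts, which is one source of the consistency advantage noted above.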
Ex. 2: Road maintenance
• Forecasts for gritting and other winter road maintenance procedures
• Input is 15 parameters over space and time
⁃ Temperature, wind speed, rain, etc
⁃ Over thousands of points on a grid
⁃ Over 24 hours (20-min interval)
• Text is generated for each of these
• Issues:
⁃ Weather terms can be context-dependent
⁃ “Light rain” in Ireland vs. “light rain” in the Sahara
⁃ Aggregating over a huge set of locations (see the sketch after this list)
⁃ Being brief yet truthful and informative
⁃ The risk of false negatives
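One way to stay brief over thousands of grid points is to aggregate locations that share the same prediction and refer to them by region. The sketch below is purely illustrative: the region names, conditions, and wording are invented.

from collections import defaultdict

# Toy input: (region, predicted road condition) for a handful of grid points.
points = [
    ("north", "ice"), ("north", "ice"), ("east", "ice"),
    ("south", "no ice"), ("west", "no ice"),
]

def aggregate(points):
    # Group the regions in which each condition is predicted.
    by_condition = defaultdict(set)
    for region, condition in points:
        by_condition[condition].add(region)
    return by_condition

def verbalise(by_condition):
    # One sentence per condition, naming all affected regions.
    sentences = []
    for condition, regions in sorted(by_condition.items()):
        region_list = " and ".join(sorted(regions))
        sentences.append(f"{condition.capitalize()} is expected in the {region_list}.")
    return " ".join(sentences)

print(verbalise(aggregate(points)))
# -> Ice is expected in the east and north. No ice is expected in the south and west.

This is where the tension above shows up: merging locations keeps the text short, but merging too aggressively risks exactly the false negatives mentioned in the last bullet.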
Ex. 3: BabyTalk
• Goal: summarize clinical data about premature babies in neonatal ICU
• Input: sensor data (blood pressure, heart rate); records of actions/
observations by medical staff
• Output: texts that summarise the multi-parameter data:
⁃ BT45: 45 mins data, for doctors
⁃ BT-Nurse: 12 hrs data, for nurses
⁃ BT-Family: 24 hrs data, for parents
• Issues here:
⁃ How to decide on evaluative terms like “stable”
⁃ How to avoid omitting clinically relevant info
⁃ How to generate a coherent narrative
⁃ How to be clear about the timeline
Ex. 4: ScubaText system
• Demo system for scuba divers
• Input is dive computer data
⁃ Depth-time profile of scuba dive
• Output is feedback to diver
⁃ Mistakes, what to do better next time
⁃ Encouragement for things done well
Other NLG apps
• Automatic journalism
• Reporting on sports results
• Textual feedback on health
• Agents and dialogue systems
• Financial reporting for companies
• Image labelling
NLG systems’ pipeline
• Data analytics and interpretation:
⁃ Making sense of the data
• Document planning:
⁃ Decide on content and structure of text
⁃ Content selection:
⁃ Of all the things I could inform you about, which should be
chosen?
⁃ Depends on what is important, what is easy to say, and what makes a good narrative
⁃ Document structure:
⁃ How should I organize this content as a text?
⁃ What order do I say things in?
⁃ What rhetorical structure?
• Microplanning:
⁃ Decide how to linguistically express text (which words, sentences, etc.
to use; how to identify objects, actions, times)
⁃ Lexical/syntactic choice:
⁃ Which words and linguistic structures to use?
⁃ Aggregation:
⁃ How should information be distributed across sentences and
paragraphs?
⁃ Reference:
⁃ How should the text refer to objects and entities?
• Linguistic Realization:
⁃ Grammatical details:
⁃ Form “legal” English sentences based on decisions made in the earlier stages
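To make the stages concrete, here is a toy end-to-end sketch. The stage names follow the pipeline above, but the data, rules, lexicon, and wording are all invented for illustration, and referring-expression generation is left out.

# Toy weather records (entirely made up).
records = [
    {"time": "morning", "temp_c": -2, "rain_mm": 0.0},
    {"time": "afternoon", "temp_c": -1, "rain_mm": 0.0},
    {"time": "evening", "temp_c": 3, "rain_mm": 2.5},
]

# 1. Data interpretation + content selection: keep only the newsworthy facts.
def select_content(records):
    messages = []
    for r in records:
        if r["temp_c"] <= 0:
            messages.append({"type": "frost", "time": r["time"]})
        if r["rain_mm"] >= 1.0:
            messages.append({"type": "rain", "time": r["time"]})
    return messages

# 2. Document structuring: order the messages chronologically.
ORDER = {"morning": 0, "afternoon": 1, "evening": 2}
def structure(messages):
    return sorted(messages, key=lambda m: ORDER[m["time"]])

# 3. Microplanning: aggregate adjacent messages of the same type into one
#    sentence, and choose a phrase for each message type (lexical choice).
LEXICON = {"frost": "frost is likely", "rain": "rain is expected"}
def microplan(messages):
    sentence_plans = []
    i = 0
    while i < len(messages):
        times = [messages[i]["time"]]
        while i + 1 < len(messages) and messages[i + 1]["type"] == messages[i]["type"]:
            i += 1
            times.append(messages[i]["time"])
        sentence_plans.append({"times": " and ".join(times),
                               "phrase": LEXICON[messages[i]["type"]]})
        i += 1
    return sentence_plans

# 4. Linguistic realisation: turn each sentence plan into a grammatical sentence.
def realise(sentence_plans):
    return " ".join(f"In the {p['times']}, {p['phrase']}." for p in sentence_plans)

print(realise(microplan(structure(select_content(records)))))
# -> In the morning and afternoon, frost is likely. In the evening, rain is expected.

Each stage here is only a handful of hand-written rules; in a real system each one is a substantial subsystem, and the realiser in particular has to handle morphology, agreement, and punctuation rather than filling a single template.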