Rather than focusing on trends and buzzwords (agents?), this post will examine practical ways to leverage AI for tangible improvements: extraction, classification, transformation, and generation. Of these, only generation is a net-new use case - the others have been staples of traditional machine learning and natural language processing for years. The major value LLMs add for these use cases is their ease of use. Since LLMs are pre-trained (that’s the P in GPT), they can be used out-of-the-box for many tasks, with no need for training phrases.


    [Illustration: a meticulous robot carefully extracting gems of data from a chaotic mountain of unorganized information.]

    1. Extraction: Pinpointing Relevant Information

    Extraction means deriving structured data from unstructured data. Traditional methods like named entity recognition (NER) have existed for some time, but LLMs have streamlined the process. LLMs excel at extracting data with a wide range of possible valid values, like names and street addresses.

    One of my favorite extraction tricks with LLMs is to have them enrich the extracted data. For example, if you are extracting city names, LLMs can also return the country that the city is in.

    Benefits:

    • Increased Efficiency: The prompt only has to describe the data to extract; there's no need for extensive training phrases or labeled data.
    • Structured Data: The LLM can return extracted data in a specific schema, in any data type.
    • Implicit Understanding: Data can be enriched with the LLM’s built-in knowledge of the world.
    👨🏾‍🦲 I’m looking for a hotel in Paris that has a pool and is close to the Louvre
    🤖 Great! Just to confirm, you wish to find hotels in Paris, near the Louvre, that have pools. Is this correct?

    Behind the scenes, the chatbot could save the extracted data as:

    {
        "city": "Paris",
        "amenities": ["pool"],
        "landmark": "Louvre"
    }
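    As a sketch of that flow (the prompt wording and key names here are illustrative assumptions, not any specific API), extraction reduces to a prompt that describes the schema plus a thin validation layer over the model's JSON-mode reply:

```python
import json

# Hypothetical prompt; the wording and schema are illustrative assumptions.
EXTRACTION_PROMPT = (
    "Extract the city, amenities, and landmark from the user's message. "
    'Respond with JSON only, e.g. {"city": ..., "amenities": [...], "landmark": ...}'
)

REQUIRED_KEYS = ("city", "amenities", "landmark")

def parse_extraction(raw: str) -> dict:
    """Parse the model's JSON-mode reply, tolerating fields it omitted."""
    data = json.loads(raw)
    for key in REQUIRED_KEYS:
        data.setdefault(key, None)  # fill in missing keys with None
    return data

# A reply a JSON-mode model might plausibly return for the hotel example:
sample_reply = '{"city": "Paris", "amenities": ["pool"], "landmark": "Louvre"}'
extracted = parse_extraction(sample_reply)
```

    The validation step matters because even in JSON mode, models occasionally drop a field; defaulting missing keys keeps downstream code simple.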
    

    [Illustration: a decisive robot stamping user messages with clear labels: “Helpful,” “Unhelpful,” “Complaint.”]

    2. Classification: Organizing User Input

    Classification is the process of assigning labels or categories to user input. In the past, it was mostly used for intent recognition or sentiment analysis. LLMs remain extremely capable at these tasks, but they also bring more nuance: they can classify data by topic, category, or application, and can even assign multiple labels at once.

    Benefits:

    • Simplified Implementation: Prompts replace complex model training.
    • Versatile Applications: Suitable for sentiment analysis, topic categorization, and more.
    👨🏾‍🦲 This product is amazing! It exceeded all my expectations.
    🤖 I’m glad to hear you’re satisfied! Your feedback is valuable to us.

    In this case, the bot detected a positive sentiment and responded accordingly.

    {
        "user_sentiment": "positive"
    }
    

    Because LLMs are still not very good with numbers, I recommend sticking to labels, or at most a simple 1 - 5 scale, for things like sentiment analysis or customer satisfaction (CSAT). Any freeform numbers, like "user_sentiment": 0.7, are liable to be made up or inconsistent. If you do need a number, have the LLM choose only between a positive and a negative label, and then use the log probs for that token as a numeric value.
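    As a sketch of that log-prob trick (the numbers below are made up, and the dict stands in for whatever your API's top-logprobs field returns for the label token), the two label probabilities can be normalized into a single score:

```python
import math

def sentiment_score(top_logprobs: dict) -> float:
    """Turn log probs for the 'positive'/'negative' label tokens into a 0-1 score."""
    # Log probs exponentiate back to probabilities; a missing label counts as 0.
    p_pos = math.exp(top_logprobs.get("positive", float("-inf")))
    p_neg = math.exp(top_logprobs.get("negative", float("-inf")))
    return p_pos / (p_pos + p_neg)

# Illustrative values; real ones come from your model API's logprobs output.
score = sentiment_score({"positive": -0.105, "negative": -2.303})
```

    Unlike asking the model to emit "0.7" directly, this score is grounded in the model's actual token distribution, so it stays consistent across calls.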

    [Illustration: a transformer robot deftly converting a messy pile of disorganized data into a sleek, structured format (JSON, XML, etc.).]

    3. Transformation: Adapting Data for Different Purposes

    Transformation involves converting data from one format to another. This can include translation, summarization, or reformatting information. LLMs have significantly simplified this process because they can parse nearly any form of text input, and, with the proliferation of JSON mode in foundation models, can consistently output to a standardized format.

    Benefits:

    • Reduced Complexity: Eliminates the need for complex rules, custom parsers, and hand-written logic.
    • Multi-Lingual Support: Facilitates effortless text translation.
    👨🏾‍🦲 How to say “How do I reset my password” in Spanish
    🤖 ¿Cómo restablezco mi contraseña?

    While it’s not necessarily a conversational use case, I’ve seen LLMs be incredibly useful for less technical users who need to transform or filter large amounts of data, be it from a spreadsheet or an API.

    [Illustration: a charismatic robot composing original content on the fly, like that meme of the dog at the computer captioned “I have no idea what I’m doing.”]

    4. Generation: Creating New Content

    Generation is the creation of new content based on prompts or examples. This capability, primarily enabled by LLMs, allows for dynamic and personalized interactions. Retrieval augmented generation (RAG) is the most common use for generation. Some platforms, like Character AI or Chai, use chatbots that are entirely generative for their users.

    Benefits:

    • Dynamic, Personalized Experiences: Enables the creation of tailored greetings or sales pitches.
    • Knowledge-Driven Responses: Facilitates the generation of unique, relevant responses using data from a knowledge base.
    👨🏾‍🦲 I’m looking for a new face moisturizer.
    🤖 Based on your past purchases, you might like our Rejuvenating Night Cream. It’s formulated for dry skin and has received excellent reviews.

    Often it is difficult to describe recommendation heuristics using code. Luckily, experienced salespeople can simply describe their recommendation process to an LLM in the prompt, and the model is capable enough to take it from there.
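    A minimal sketch of what that looks like in practice (the heuristic text, profile fields, and product names are all invented for illustration): the salesperson's described process becomes the prompt verbatim, and the customer's profile and request are appended:

```python
# All names and text here are illustrative assumptions, not a real catalog.
SALES_HEURISTIC = (
    "Recommend one product. Prefer items matching the customer's skin type, "
    "then items similar to past purchases, and mention recent reviews."
)

def build_recommendation_prompt(heuristic: str, profile: dict, request: str) -> str:
    """Assemble a prompt from the salesperson's own description of their process."""
    return (
        f"{heuristic}\n\n"
        f"Customer profile: {profile}\n"
        f"Customer request: {request}"
    )

prompt = build_recommendation_prompt(
    SALES_HEURISTIC,
    {"skin_type": "dry", "past_purchases": ["Hydrating Serum"]},
    "I'm looking for a new face moisturizer.",
)
```

    The point is that the heuristic stays in plain English, maintained by the person who knows it best, rather than being translated into brittle if/else logic.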

    Implementation

    Of the four use cases, extraction and classification are the easiest and lowest-risk to implement today. They are, however, the lowest-value use cases and are quickly becoming table stakes for any AI application.

    Transformation and generation use cases are where the magic happens, but they need to be developed against a robust testing process and well-defined success criteria to ensure quality. This will add to development time, but also to product quality.

    Navigating the world of LLMs in conversational AI can be complex. If you’re looking to implement these strategies effectively and avoid common pitfalls, I can help. Reach out, and we’ll explore how to create amazing conversational experiences for your users.

    ❤️

    Gordy