Wednesday, December 10, 2025

Building Blocks - The Lego Approach

My wife was recently in Taiwan visiting her family for a couple of weeks, and she was facing a long flight home in a few days. She wanted to read a book that had been assigned by our priest at church. Simple enough request, right? Except the book wasn't available as a digital download, wasn't on Kindle, wasn't on any of the usual platforms.

Now, I could have just said "sorry, can't help" and let her wait to read the physical copy when she got home. But that's not how you solve problems when you work with technology. You start asking: what pieces do I have? What can I combine? What's possible if I string things together?

Let me tell you what happened.

 

The Problem: A Book That Doesn't Want to Be Digital

I found the book on archive.org, a wonderful resource where people have scanned and uploaded books into a lending library. You can sign up, borrow a book digitally, and it has this reading icon that, when you press it, reads the book aloud in a mechanical voice. Not bad for accessibility purposes, provided you have a good internet connection.

But there was no download option. The book was available to read on screen or listen to through their player, but I couldn't send it to my wife. I did buy a physical copy (I'm legitimate here), but that didn't solve the immediate problem: getting her something to read on a 13-hour flight. Scanning the book would have taken a day or two.

So I sat there looking at this reading icon on my iPad and thought: what if I treat this like building blocks?

 

The Chain: Archive → Audio → Otter → Claude → PDF

Here's what I built:

Block 1: Archive.org gave me access to the book with an audio reader. Mechanical voice, but it worked.

Block 2: Otter on my iPhone. I put my iPhone next to my iPad, turned on the speaker, hit play on the archive.org reader, and let Otter record and transcribe everything. My iPad was reading the book aloud while my iPhone sat there listening and capturing it all as a transcript. I had to do it in two parts, since Otter Pro has a 4-hour recording limit. The first obstacle was that the resulting transcript ran everything together, like a stream of consciousness.

Block 3: Claude for editing. I took the Otter transcript and fed it into Claude (ChatGPT didn't work well for this) and said: "Act as an editor. Put in the paragraph breaks, add the chapter titles and subtitles, clean this up."

That part took some iteration. We had to establish editing rules — how to handle dialogue, where to break paragraphs, how to identify chapter markers. The mechanical voice reading meant some punctuation cues were lost and some formatting was ambiguous. To aid in the process, I provided a scan of the Table of Contents so Claude could better identify where chapter breaks happened. Once we got the rules set up, Claude could process the book chapter by chapter.


The sidebars in the book gave us trouble at first. But we figured out a way to flag them, based on a list of sidebars I created, and handled them separately. Claude did a pretty good job with those too.
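As a side note for the technically curious: if you wanted to script that rule-driven, chapter-by-chapter editing step instead of working in the chat window, a minimal sketch using the Anthropic Python SDK might look like the following. I did all of this in the regular Claude chat interface, so the rules text, chunking, and model name below are illustrative assumptions, not the actual setup.

```python
# Hypothetical sketch: applying a fixed set of editing rules to each chapter
# of a raw transcript via the Anthropic API. The rules text, model name, and
# chunking scheme are illustrative; the actual project ran in Claude's chat UI.
import anthropic

EDITING_RULES = """You are acting as a book editor. Apply these rules:
1. Insert paragraph breaks at natural topic shifts.
2. Restore chapter titles and subtitles using the Table of Contents provided.
3. Put quoted dialogue on its own lines with proper punctuation.
4. Set sidebar text apart from the body, using the sidebar list provided.
Do not rewrite the author's words; only restore structure and punctuation."""

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

def edit_chapter(raw_text: str, toc: str, sidebar_list: str) -> str:
    """Send one chapter-sized chunk of the Otter transcript to Claude."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model name
        max_tokens=4096,
        system=EDITING_RULES,
        messages=[{
            "role": "user",
            "content": (
                f"Table of Contents:\n{toc}\n\n"
                f"Sidebar list:\n{sidebar_list}\n\n"
                f"Raw transcript for this chapter:\n{raw_text}"
            ),
        }],
    )
    return response.content[0].text
```

The point isn't the specific code; it's that the rules live in one place and get applied the same way to every chapter.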

Block 4: Assembly. I took the edited chapters, assembled them into a single document, converted it to PDF, and sent it to my wife. She loaded the PDF into Kindle to read it on her iPad during the flight.
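For completeness, here's what that assembly step could look like as a small script, assuming plain-text chapter files and the fpdf2 library. The file names are made up, and the workflow above doesn't depend on any particular converter; this is just one way to snap the last two blocks together.

```python
# Hypothetical assembly step: concatenate the edited chapters and produce one
# PDF. Assumes plain-text chapter files and the fpdf2 library; the original
# workflow did not prescribe a specific converter.
from pathlib import Path
from fpdf import FPDF

pdf = FPDF()
pdf.set_auto_page_break(auto=True, margin=15)

for chapter_file in sorted(Path("edited_chapters").glob("chapter_*.txt")):
    text = chapter_file.read_text(encoding="utf-8")
    pdf.add_page()
    pdf.set_font("Helvetica", size=11)
    # The built-in fonts are Latin-1 only; replace unsupported characters.
    safe_text = text.encode("latin-1", "replace").decode("latin-1")
    pdf.multi_cell(0, 6, safe_text)

pdf.output("book.pdf")  # ready to send along and load into the Kindle app
```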

None of these tools were designed to work together. Archive.org wasn't meant to be an audio source for Otter. Otter wasn't meant to transcribe books. Claude wasn't meant to be a book formatter. But when you put them together in sequence, each one doing what it does well, you get a solution that didn't exist before.

 

What I Like About This Approach

This is what I call the Lego approach: building blocks of technology that you can snap together in ways their creators never imagined.

Think about it: I didn't need a special "convert protected digital library books to readable PDFs" application. I didn't need to learn complex workarounds or break any digital rights management. I just needed to recognize that I had pieces that could connect to each other.

Archive.org → outputs audio

Otter → inputs audio, outputs text

Claude → inputs text, outputs formatted text

PDF converter → inputs formatted text, outputs readable document

Kindle → inputs a readable PDF document, outputs an organized book with bookmarks and annotations

Each block does one thing well. The magic is in recognizing how they can connect.

This is how we've been approaching problems in the Data4Good team too. We don't always have the perfect tool for every job. But we have a growing collection of building blocks — web scrapers, transcription services, AI editors, data analyzers, visualization tools. The question isn't "do we have the exact right tool?" The question is "what combination of tools gets us there?"

 

The AI Editing Part: Rules Matter

I do want to mention one thing about the Claude editing phase, because it taught me something important. 

When I first fed the transcript to ChatGPT, it didn't work well. When I switched to Claude and just said "clean this up," it also struggled. The breakthrough came when we established rules together:

  • How to identify chapter breaks
  • Where to place paragraph breaks
  • How to handle quoted dialogue
  • How to format section headers
  • What to do with sidebars

Once we had those rules articulated, Claude could apply them consistently across all the chapters. It wasn't about the AI being "smart enough"; it was about iterating, with some trial and error, being clear about what I wanted, and giving the AI enough process clarity to get it right.

This connects back to the building blocks idea: the better you understand what each block does well (and what it doesn't), the better you can connect them. Claude is excellent at applying consistent rules to large volumes of text. But it needed me to establish what those rules were, using evidence from the actual transcript we were working with.

 

It also underscores the conversational approach to problem-solving that I advocate. The back-and-forth dialogue with AI is itself a way to iterate toward a solution, which is why I often approach AI as a conversation.

 

Confession 

Let me be completely honest about the timeline and effort involved. Looking back at the file history, the AI editing phase turned out to be the most difficult and time-consuming building block. The project took about two weeks (late October through mid-November) with at least 23 iterations across chapters and components. I went through 6 versions of the editing rules themselves as we refined the process.

Is that faster than manually editing the raw transcripts? Probably not — the ROI isn't there yet if you're measuring pure efficiency. But the learning value was substantial. I now understand how to structure rules for AI editing, what works and what doesn't, and I have a reusable process. The first book took 13 days with 23 iterations. The next one would hopefully be faster.


What Do You Think?

This project makes me think about how we approach innovation. We often talk about finding "the right application" or waiting for technology to advance enough to solve our problems. But maybe the more valuable skill is recognizing that you can be your own systems integrator. You can build the chain.

The building blocks are already there. Archive.org exists. Otter exists. Claude exists. PDF converters exist. Kindle exists. None of them were designed to work together for this purpose. But they can.

So here's my question for you: What problem are you facing that doesn't have a ready-made solution? What building blocks do you have access to? What happens if you start connecting them? When was the last time you solved a problem by chaining tools together rather than finding the perfect tool? What makes you hesitate to try unconventional combinations of technologies?

The Lego approach isn't about having all the perfect pieces. It's about recognizing that the pieces you have can snap together in ways you haven't tried yet. When the right tool doesn't exist, look for building blocks you can connect. Each tool should do one thing well; the magic is in the connections. Iteration and rule-setting are part of the building process. Being a systems integrator is a valuable skill in the AI age.


For Further Reading

If you're interested in exploring the building blocks approach further, here are some related stories from my Letters to a Young Manager collection:

  1. "The Lego's Lesson" (Story #9) - A management training exercise using Lego blocks metaphor that reveals how deadline pressure changes our approach to teamwork and process
  2. "Assemble the Components" (Story #5) - How building reusable program subroutines taught me that "assembly is easier and faster than creating from scratch"
  3. "The Truck" (Story #296) - The story of a boy who solved a stuck truck problem with a brilliantly simple solution: "Just let the air out of the tires"

 

[1] This post was created with AI assistance (Claude), drawing from the author’s documents, meeting transcripts, and lessons learned from the project described. The content was then reviewed, edited, and adapted by the author.


"The postings on this site are my own and don't necessarily represent positions, strategies or opinions of any of the organizations with which I am associated."

Friday, May 9, 2025

From Queries to Agency: The Evolution of the Ask the Professor Project

When I began working with OpenAI's ChatGPT-4 to build a custom chatbot, I expected to be impressed. What I didn’t expect was to be drawn into a journey—one that would challenge my understanding of AI’s role in structured projects and, ultimately, redefine that role as a collaborative partnership. Here is an account of that experience, written, of course, with the aid of my new AI collaborator.

 

The Ask the Professor project has been a journey not just through technology, but through the emergence of a new kind of partnership between human expertise and artificial intelligence.[2] What began as a simple idea—to create a digital extension of my teaching and mentorship—quickly transformed into an exploration of how AI can not only answer questions but also support structured projects, execute real tasks, and contribute meaningfully to complex workflows.

Key Takeaways and Project Context

This journey was motivated by two driving forces: the need to evolve my persona chatbot from its MVP state to a more robust (and lower cost) platform, and the challenge posed by Reid Hoffman in Impromptu and Ethan Mollick in his blog posts to engage AI in a more collaborative conversation.[3] More than that, it became an application of my long-held principle: the insights and truths of an idea (or application) come through the conversation.

More Than a Query Engine

When we think about Large Language Models (LLMs), it's easy to imagine them as glorified search engines: you ask a question, they retrieve an answer, albeit in a narrative style. But the Ask the Professor project quickly moved beyond that role.

To build the Ask the Professor project, I established a two-window strategy: one window was dedicated to my conversations with GPT-4, where we iteratively designed solutions, refined approaches, and mapped out next steps. The second window hosted the evolving chatbot, where those ideas were tested in real time. It was like having a project assistant that not only strategized with me but executed the plan, line by line, as we progressed.

Early in the project, it became clear that this GPT needed to do more than simply give me checklists for progressing the project.

At first, we iterated through structuring the chatbot to reference meeting and classroom transcripts, anonymize student names for privacy, and pose Socratic questions to engage deeper thinking.

But that was just the beginning.

Two-Window Setup: Building and Testing

This dual-window strategy became the backbone of development. In one window, GPT-4 acted as a project assistant — a partner in planning, strategy, and problem-solving. We refined anonymization processes, mapped student names to pseudonyms, and structured the chatbot's logic. In the second window, I applied those steps directly to the evolving Ask the Professor chatbot, testing, refining, and iterating in real-time.

This setup allowed us to move fluidly between ideation and implementation, shortening the cycle from concept to testing.

A New Kind of Collaboration

The turning point came when we ran into the challenge of anonymizing a large set of transcripts. Simple word swaps were not enough; we needed a systematic approach to detect, replace, and verify that each student name was consistently anonymized across different documents—without losing the thread of conversation.

At this point, the Ask the Professor project evolved from being a responsive assistant to something more agentic. Through Python scripting, the AI was able to generate code, execute it, and provide me with transformed files—all while keeping track of name mappings and cross-referencing them with anonymized lists.

This was more than a simple query-response mechanism. This was an aide-de-camp—a trusted assistant that could not only answer but act on information. It processed my Word files, mapped out all the speaker labels, replaced them with anonymized names, and returned them to me in a ready-to-use format. It even produced a translation table to keep track of these changes—no small feat for a conversational interface.
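To make that concrete, here is a simplified sketch of the anonymize-and-track idea. The real code was generated and executed by GPT-4 on OpenAI's servers against my Word documents; this version assumes plain-text transcripts, and the names, pseudonym scheme, and file paths are invented for illustration.

```python
# Simplified sketch: replace each student name with a consistent pseudonym
# across every transcript and keep a translation table for verification.
# The actual project code was generated and run by GPT-4 on OpenAI's servers
# against Word files; the names and paths here are illustrative only.
import csv
import re
from pathlib import Path

student_names = ["Alice Chen", "Bob Rivera", "Carol Okafor"]  # invented examples
pseudonyms = {name: f"Student {i + 1}" for i, name in enumerate(student_names)}

# One regex that matches any known name on a word boundary.
pattern = re.compile(r"\b(" + "|".join(map(re.escape, pseudonyms)) + r")\b")

Path("anonymized").mkdir(exist_ok=True)
for path in Path("transcripts").glob("*.txt"):
    text = path.read_text(encoding="utf-8")
    anonymized = pattern.sub(lambda m: pseudonyms[m.group(0)], text)
    (Path("anonymized") / path.name).write_text(anonymized, encoding="utf-8")

# Translation table so the mapping can be checked (and reversed if needed).
with open("translation_table.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["original_name", "pseudonym"])
    writer.writerows(pseudonyms.items())
```

The consistency comes from the single mapping dictionary: every document sees the same replacements, and the table records exactly what changed.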

Moving Toward Agentic AI

What we accomplished here hints at something much larger: the progression from reactive LLMs to agentic AI.[4] The difference? A reactive LLM waits for input and responds; an agentic AI proactively engages in problem-solving, can be tasked with a goal, and can execute multi-step operations independently, much like a research assistant would.

Moreover, GPT-4 did not simply assist; it guided decisions. It would suggest coding solutions, ask for approval, and execute on those ideas. This was not automation; this was orchestration. Together, we iterated—anonymizing names, transforming transcripts, and generating structured outputs—all within minutes. Tasks that once took days were now flowing seamlessly between ideation and execution, with GPT acting as both a collaborator and an executor.

A key difference from fully agentic AI was that, in this conversation, the AI would suggest a programming step to take (an agent task, if you will) but pause to ask what I thought and whether to proceed. I could then modify the approach, and we’d go through another iteration in the conversation. When it took the programming steps, with my approval, it generated the code, executed it, produced the modified files, and prompted me to download and review them. All of this took place on OpenAI’s servers, not my local computer.

In short, the Ask the Professor project did not just assist; it extended my capability. It searched, extracted, transformed, and delivered results—an entire workflow that would have taken days to perform manually, condensed into iterative cycles of refinement and automation.

Achilles' Heel: The Limits of Agentic AI

For all its advances, the Ask the Professor project was not without its challenges. In pushing the boundaries of what agentic AI could do, we also exposed its limitations—limitations that became apparent not just in its performance, but in its dependencies on human oversight.

While the Ask the Professor project highlights impressive advances in how LLMs can execute structured tasks, it also exposed critical limitations along the way. Many of these limitations were mitigated over time, again through careful iteration and structured refinement.

  • Dependency on Human Judgment: Each stage of transformation—anonymization, cross-referencing, and structured processing—still depended on my strategic intervention. GPT acted as a powerful tool, even suggesting ways forward, but not as an autonomous agent. It required my oversight to identify hallucinations, correct anonymization logic, and prompt more nuanced outputs.
  • Memory and Continuity Gaps: Despite repeated interactions, the model often failed to maintain name-role consistency or remember context across documents. Anonymization of student names was particularly challenging, but a solution was eventually developed through multiple iterations of name extraction, replacement logic, and systematic document updates. This iterative process not only resolved the inconsistencies but set the stage for a more robust anonymization workflow going forward.
  • Susceptibility to Hallucinations: During the early stages of the project, the assistant sometimes fabricated links or summarized content that did not exist in the files. This was corrected by refining the context instructions and limiting the model's scope strictly to the uploaded documents. As we honed its access to structured data, hallucinations dropped off significantly, demonstrating the importance of well-defined boundaries for LLM behavior.

These limitations suggest that while the Ask the Professor project pushed boundaries of GPT-driven task execution, it remains bounded by the need for human discernment and real-time correction.

Key Takeaways and Looking Ahead

  • LLMs are capable of more than reactive responses; with structured guidance, they can execute multi-step workflows.
  • However, their agentic qualities are still bounded by the need for oversight and strategic redirection.
  • True AI collaboration still relies heavily on human expertise to identify hallucinations, ensure continuity, and validate outcomes.

Looking forward, the Ask the Professor project will continue to evolve, with sharper integration of agentic capabilities, refined anonymization processes, and deeper continuity across conversation threads. The goal is not just a more capable assistant—but a true conversational partner that mirrors the rigor and reflection of real-world mentorship.

We are not just building a chatbot; we are architecting a digital aide-de-camp—an agent that learns, iterates, and collaborates in real time. The line between assistant and agent is beginning to blur, and the Ask the Professor project is at the frontier of that transformation. As we refine its agentic capabilities, I can’t help but wonder: if this is what’s possible now, what might the next iteration reveal?



[1] Note that this draft was co-produced by iterating with ChatGPT-4, based on two days of experience working on the Ask the Professor project. In addition to edits, I have added some footnotes. To review a log of the conversation with GPT-4 or the Python code it produced, send me a note.

[2] The Ask the Professor project is a work in progress that hopefully will be released on my Blog soon so all readers can test it.  Meanwhile, you can get a flavor from the earlier MVP edition, here: https://eghapp.blogspot.com/2024/10/the-happgpt-professor-chatbot-test.html

[3] Reid Hoffman, “Impromptu: Amplifying Our Humanity Through AI,” Kindle Edition, 2023, https://www.amazon.com/Impromptu-Amplifying-Our-Humanity-Through-ebook/dp/B0BYG9V1RN/ 

and Ethan Mollick, “Co-Intelligence: Living and Working with AI,” Kindle Edition, 2024, https://www.amazon.com/Co-Intelligence-Living-Working-Ethan-Mollick-ebook/dp/B0CM8TRWK3/

Also see Ethan Mollick, “On Jagged AGI: o3, Gemini 2.5, and everything after,” Apr 20, 2025, https://www.oneusefulthing.org/p/on-jagged-agi-o3-gemini-25-and-everything. Watch the video!

[4] For a simple overview of Agentic AI, see IBM’s recent blog post by Cole Stryker, “What is agentic AI?”, https://www.ibm.com/think/topics/agentic-ai .  For a more in-depth discussion see Edwin Lisowski, “AI Agents vs Agentic AI: What’s the Difference and Why Does It Matter?” Medium, Dec 18, 2024,  https://medium.com/@elisowski/ai-agents-vs-agentic-ai-whats-the-difference-and-why-does-it-matter-03159ee8c2b4