When I began working with OpenAI's GPT-4 to build a custom chatbot, I expected to be impressed. What I didn't expect was to be drawn into a journey—one that would challenge my understanding of AI's role in structured projects and, ultimately, redefine it as a collaborative partner. Here is an account of that experience, written, of course, with the aid of my new AI collaborator.[1]
The Ask the Professor project has been a journey not
just through technology, but through the emergence of a new kind of partnership
between human expertise and artificial intelligence.[2]
What began as a simple idea—to create a digital extension of my teaching and
mentorship—quickly transformed into an exploration of how AI can not only
answer questions but also support structured projects, execute real tasks, and
contribute meaningfully to complex workflows.
Project Context and Motivation
This journey was motivated by two driving forces: the need
to evolve my persona chatbot from its MVP state to a more robust (and
lower-cost) platform, and the challenge posed by Reid Hoffman in Impromptu
and by Ethan Mollick in Co-Intelligence and his blog posts to engage AI in a
more collaborative conversation.[3]
More than that, it became an application of my long-held principle: the
insights and truths of an idea (or application) come through the conversation.
More Than a Query Engine
When we think about Large Language Models (LLMs), it's easy
to imagine them as glorified search engines: you ask a question, they retrieve
an answer, albeit in a narrative style. But the Ask the Professor
project quickly moved beyond that role.
To build the Ask the Professor project, I established a
two-window strategy: one window was dedicated to my conversations with GPT-4,
where we iteratively designed solutions, refined approaches, and mapped out
next steps. The second window hosted the evolving chatbot, where those ideas
were tested in real time. It was like having a project assistant that not only
strategized with me but executed the plan, line by line, as we progressed.
Early in the project, it became clear that this GPT needed
to do more than simply hand me checklists for moving the project forward.
At first, we iterated on structuring the chatbot to
reference meeting and classroom transcripts, anonymize student names for
privacy, and pose Socratic questions that engage deeper thinking.
But that was just the beginning.
Two-Window Setup: Building and Testing
This dual-window strategy became the backbone of
development. In one window, GPT-4 acted as a project assistant — a partner in
planning, strategy, and problem-solving. There we refined anonymization
processes, mapped student names to pseudonyms, and structured the chatbot's
logic. In the second window, I applied those steps directly to the evolving
Ask the Professor chatbot, testing, refining, and iterating in real time.
This setup allowed us to move fluidly between ideation and
implementation, shortening the cycle from concept to testing.
A New Kind of Collaboration
The turning point came when we ran into the challenge of
anonymizing a large set of transcripts. Simple word swaps were not enough; we
needed a systematic approach to detect, replace, and verify that each student
name was consistently anonymized across different documents—without losing the
thread of conversation.
At this point, my GPT-4 collaborator evolved
from a responsive assistant into something more agentic. Through Python
scripting, the AI was able to generate code, execute it, and provide me with
transformed files—all while keeping track of name mappings and
cross-referencing them with anonymized lists.
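To make that concrete, here is a minimal sketch of the kind of anonymization script GPT-4 generated and ran on my behalf. The file paths, the NAME_MAP entries, and the helper functions are hypothetical placeholders rather than the project's actual code (which is available on request, per the endnote).

```python
import re
from pathlib import Path

# Hypothetical name map; the actual project built this table from the
# transcripts themselves and kept it consistent across every file.
NAME_MAP = {
    "Jane Doe": "Student A",
    "John Smith": "Student B",
}

def anonymize(text: str, name_map: dict[str, str]) -> str:
    """Replace each mapped name, longest first so 'Jane Doe' wins over 'Jane'."""
    for name in sorted(name_map, key=len, reverse=True):
        text = re.sub(re.escape(name), name_map[name], text)
    return text

def verify(text: str, name_map: dict[str, str]) -> list[str]:
    """Return any original names that survived anonymization."""
    return [name for name in name_map if re.search(re.escape(name), text)]

Path("anonymized").mkdir(exist_ok=True)
for path in Path("transcripts").glob("*.txt"):
    clean = anonymize(path.read_text(encoding="utf-8"), NAME_MAP)
    if (leftovers := verify(clean, NAME_MAP)):
        print(f"{path.name}: still contains {leftovers}")
    (Path("anonymized") / path.name).write_text(clean, encoding="utf-8")
```

The verify step matters as much as the replacement itself: it is what let us confirm, file by file, that every mapped name had actually been scrubbed.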
This was more than a simple query-response mechanism. This
was an aide-de-camp—a trusted assistant that could not only answer but act on
information. It processed my Word files, mapped out all the speaker labels,
replaced them with anonymized names, and returned them to me in a ready-to-use
format. It even produced a translation table to keep track of these changes—no
small feat for a conversational interface.
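The Word-file side of the workflow, including the translation table, looked roughly like the following sketch, which assumes the python-docx library; SPEAKER_MAP and the file paths are again illustrative stand-ins, not the project's actual code.

```python
import csv
from pathlib import Path
from docx import Document  # pip install python-docx

# Hypothetical speaker-label map; the real table was built by first
# scanning every transcript for lines of the form "Name: ...".
SPEAKER_MAP = {"Jane Doe": "Student A", "John Smith": "Student B"}

def anonymize_docx(src: Path, dst: Path, mapping: dict[str, str]) -> None:
    doc = Document(str(src))
    for para in doc.paragraphs:
        new_text = para.text
        for name, alias in mapping.items():
            new_text = new_text.replace(name, alias)
        if new_text != para.text:
            para.text = new_text  # note: this flattens run-level formatting
    doc.save(str(dst))

Path("anonymized").mkdir(exist_ok=True)
for src in Path("transcripts").glob("*.docx"):
    anonymize_docx(src, Path("anonymized") / src.name, SPEAKER_MAP)

# The translation table, written out as a CSV for review.
with open("translation_table.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["original_name", "pseudonym"])
    writer.writerows(SPEAKER_MAP.items())
```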
Moving Toward Agentic AI
What we accomplished here hints at something much larger:
the progression from reactive LLMs to agentic AI.[4]
The difference? A reactive LLM waits for input and responds; an agentic AI
proactively engages in problem-solving: it can be tasked with a goal and can
execute multi-step operations independently, much like a research assistant would.
Moreover, GPT-4 did not simply assist; it guided decisions.
It would suggest coding solutions, ask for approval, and then execute those
ideas. This was not automation; this was orchestration. Together, we
iterated—anonymizing names, transforming transcripts, and generating structured
outputs—all within minutes. Tasks that once took days were now flowing
seamlessly between ideation and execution, with GPT acting as both a
collaborator and an executor.
A key difference from fully agentic AI was that in this
conversation, the AI would suggest a programming step to take (an agent task,
if you will) but pause to ask what I thought and whether to proceed. I could
then modify the approach, and we'd go through another iteration in the
conversation. When it took those programming steps, with my approval, it
generated the code, executed it, produced the modified files, and prompted me
to download and review them. All of this took place on OpenAI's servers, not
my local computer.
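Stripped of the chat interface, the interaction pattern we fell into resembles the human-in-the-loop sketch below. PLAN, execute_step, and run_with_approval are purely illustrative stand-ins; in the real sessions, GPT-4 proposed each step itself and ran the code server-side.

```python
# Illustrative plan; in practice, GPT-4 suggested each step in conversation.
PLAN = ["extract speaker names", "apply pseudonym map", "verify replacements"]

def execute_step(step: str) -> str:
    """Stand-in for GPT-4 generating and running code on OpenAI's servers."""
    return f"completed '{step}'"

def run_with_approval(plan: list[str]) -> None:
    for step in plan:
        answer = input(f"Proposed step: {step}. Approve, modify, or stop? ").strip().lower()
        if answer.startswith("stop"):
            break
        if answer.startswith("modify"):
            step = input("Describe the revised step: ")
        print(execute_step(step) + " -- download and review the output files")

run_with_approval(PLAN)
```

The approval gate before each execution is the whole point of the pattern: it keeps a human decision between every proposed step and its consequences.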
In short, the Ask the Professor project did not just
assist; it extended my capability. It searched, extracted, transformed, and
delivered results—an entire workflow that would have taken days to perform
manually, condensed into iterative cycles of refinement and automation.
Achilles' Heel: The Limits of Agentic AI
For all its advances, the Ask the Professor project
was not without its challenges. In pushing the boundaries of what agentic AI
could do, we also exposed its limitations—limitations that became apparent not
just in the model's performance, but in its dependence on human oversight.
Many of these limitations were mitigated over time, again through careful
iteration and structured refinement.
- Dependency on Human Judgment: Each stage of transformation—anonymization, cross-referencing, and structured processing—still depended on my strategic intervention. GPT acted as a powerful tool, even suggesting ways forward, but not as an autonomous agent. It required my oversight to identify hallucinations, correct anonymization logic, and prompt more nuanced outputs.
- Memory and Continuity Gaps: Despite repeated interactions, the model often failed to maintain name-role consistency or to remember context across documents. Anonymizing student names was particularly challenging, but a solution eventually emerged through multiple iterations of name extraction, replacement logic, and systematic document updates. This iterative process not only resolved the inconsistencies but set the stage for a more robust anonymization workflow going forward.
- Susceptibility to Hallucinations: In the early stages of the project, the assistant sometimes fabricated links or summarized content that did not exist in the files. This was corrected by refining the context instructions and limiting the model's scope strictly to the uploaded documents (see the sketch after this list). As we honed its access to structured data, hallucinations dropped off significantly, demonstrating the importance of well-defined boundaries for LLM behavior.
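That last fix amounted to tightening the system instructions. As a hedged illustration, the scope-limiting guidance we converged on was in the spirit of the sketch below; the wording and the SYSTEM_INSTRUCTIONS name are hypothetical, not the project's actual prompt.

```python
# A sketch (hypothetical wording) of the scope-limiting instructions that
# curbed hallucinations; the project's actual system prompt differed.
SYSTEM_INSTRUCTIONS = """\
You are the Ask the Professor assistant.
Answer ONLY from the uploaded transcripts and documents.
If an answer is not in those files, say so plainly; never invent
links, quotes, or document content.
Refer to students only by their anonymized names.
"""
```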
These limitations suggest that while the Ask the
Professor project pushed the boundaries of GPT-driven task execution, it
remains bounded by the need for human discernment and real-time correction.
Key Takeaways and Looking Ahead
- LLMs are capable of more than reactive responses; with structured guidance, they can execute multi-step workflows.
- However, their agentic qualities are still bounded by the need for oversight and strategic redirection.
- True AI collaboration still relies heavily on human expertise to identify hallucinations, ensure continuity, and validate outcomes.
Looking forward, the Ask the Professor project will
continue to evolve, with sharper integration of agentic capabilities, refined
anonymization processes, and deeper continuity across conversation threads. The
goal is not just a more capable assistant—but a true conversational partner
that mirrors the rigor and reflection of real-world mentorship.
We are not just building a chatbot; we are architecting a
digital aide-de-camp—an agent that learns, iterates, and collaborates in
real time. The line between assistant and agent is beginning to blur, and the Ask
the Professor project is at the frontier of that transformation. As we
refine its agentic capabilities, I can’t help but wonder: if this is what’s
possible now, what might the next iteration reveal?
[1] Note that this draft was co-produced by iterating with GPT-4, based on two
days of experience working on the Ask the Professor project. In addition to
edits, I have added some footnotes. To review a log of the conversation with
GPT-4 or the Python code it produced, send me a note.
[2] The Ask the Professor project is a work in progress that I hope to
release on my blog soon so all readers can test it. Meanwhile, you can get a
flavor of it from the earlier MVP edition, here:
https://eghapp.blogspot.com/2024/10/the-happgpt-professor-chatbot-test.html
[3] Reid Hoffman, "Impromptu: Amplifying Our Humanity Through AI," Kindle
Edition, 2023,
https://www.amazon.com/Impromptu-Amplifying-Our-Humanity-Through-ebook/dp/B0BYG9V1RN/,
and Ethan Mollick, "Co-Intelligence: Living and Working with AI," Kindle
Edition, 2024,
https://www.amazon.com/Co-Intelligence-Living-Working-Ethan-Mollick-ebook/dp/B0CM8TRWK3/.
Also see Ethan Mollick, "On Jagged AGI: o3, Gemini 2.5, and everything after,"
Apr 20, 2025,
https://www.oneusefulthing.org/p/on-jagged-agi-o3-gemini-25-and-everything.
Watch the video!
[4] For a simple overview of agentic AI, see IBM's recent blog post by Cole Stryker, "What is agentic AI?", https://www.ibm.com/think/topics/agentic-ai. For a more in-depth discussion, see Edwin Lisowski, "AI Agents vs Agentic AI: What's the Difference and Why Does It Matter?", Medium, Dec 18, 2024, https://medium.com/@elisowski/ai-agents-vs-agentic-ai-whats-the-difference-and-why-does-it-matter-03159ee8c2b4