When I began working with OpenAI's GPT-4 to build a custom chatbot, I expected to be impressed. What I didn't expect was to be drawn into a journey—one that would challenge my understanding of AI's role in structured projects and, ultimately, redefine it as a collaborative partner. Here is an account of that experience, written, of course, with the aid of my new AI collaborator.[1]
The Ask the Professor project has been a journey not
just through technology, but through the emergence of a new kind of partnership
between human expertise and artificial intelligence.[2]
What began as a simple idea—to create a digital extension of my teaching and
mentorship—quickly transformed into an exploration of how AI can not only
answer questions but also support structured projects, execute real tasks, and
contribute meaningfully to complex workflows.
Project Context and Motivation
This journey was motivated by two driving forces: the need
to evolve my persona chatbot from its MVP state to a more robust (and
lower-cost) platform, and the challenge posed by Reid Hoffman in Impromptu
and by Ethan Mollick in Co-Intelligence and his blog posts to engage AI in a
more collaborative conversation.[3]
More than that, it became an application of my long-held principle: the
insights and truths of an idea (or application) come through the conversation.
More Than a Query Engine
When we think about Large Language Models (LLMs), it's easy
to imagine them as glorified search engines: you ask a question, they retrieve
an answer, albeit in a narrative style. But the Ask the Professor
project quickly moved beyond that role.
To build the Ask the Professor project, I established a
two-window strategy: one window was dedicated to my conversations with GPT-4,
where we iteratively designed solutions, refined approaches, and mapped out
next steps. The second window hosted the evolving chatbot, where those ideas
were tested in real time. It was like having a project assistant that not only
strategized with me but executed the plan, line by line, as we progressed.
Early in the project, it became clear that this GPT needed
to do more than simply hand me checklists for moving the project forward.
At first, we iterated on structuring the chatbot to
reference meeting and classroom transcripts, anonymize student names for
privacy, and pose Socratic questions that engage deeper thinking.
But that was just the beginning.
Two-Window Setup: Building and Testing
This dual-window strategy became the backbone of
development. In one window, GPT-4 acted as a project assistant — a partner in
planning, strategy, and problem-solving. There we refined anonymization
processes, mapped student names to pseudonyms, and structured the chatbot's
logic. In the second window, I applied those steps directly to the evolving
Ask the Professor chatbot, testing, refining, and iterating in real time.
This setup allowed us to move fluidly between ideation and
implementation, shortening the cycle from concept to testing.
A New Kind of Collaboration
The turning point came when we ran into the challenge of
anonymizing a large set of transcripts. Simple word swaps were not enough; we
needed a systematic approach to detect, replace, and verify that each student
name was consistently anonymized across different documents—without losing the
thread of conversation.
At this point, my GPT-4 collaborator evolved
from a responsive assistant into something more agentic. Through Python
scripting, the AI was able to generate code, execute it, and provide me with
transformed files—all while keeping track of name mappings and
cross-referencing them with anonymized lists.
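To make that concrete, here is a minimal sketch of the kind of anonymization script GPT-4 generated and ran on my behalf. The file paths, the NAME_MAP entries, and the helper functions are hypothetical placeholders rather than the project's actual code (which is available on request, per the endnote).

```python
import re
from pathlib import Path

# Hypothetical name map; the actual project built this table from the
# transcripts themselves and kept it consistent across every file.
NAME_MAP = {
    "Jane Doe": "Student A",
    "John Smith": "Student B",
}

def anonymize(text: str, name_map: dict[str, str]) -> str:
    """Replace each mapped name, longest first so 'Jane Doe' wins over 'Jane'."""
    for name in sorted(name_map, key=len, reverse=True):
        text = re.sub(re.escape(name), name_map[name], text)
    return text

def verify(text: str, name_map: dict[str, str]) -> list[str]:
    """Return any original names that survived anonymization."""
    return [name for name in name_map if re.search(re.escape(name), text)]

Path("anonymized").mkdir(exist_ok=True)
for path in Path("transcripts").glob("*.txt"):
    clean = anonymize(path.read_text(encoding="utf-8"), NAME_MAP)
    if (leftovers := verify(clean, NAME_MAP)):
        print(f"{path.name}: still contains {leftovers}")
    (Path("anonymized") / path.name).write_text(clean, encoding="utf-8")
```

The verify step matters as much as the replacement itself: it is what let us confirm, file by file, that every mapped name had actually been scrubbed.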
This was more than a simple query-response mechanism. This
was an aide-de-camp—a trusted assistant that could not only answer but act on
information. It processed my Word files, mapped out all the speaker labels,
replaced them with anonymized names, and returned them to me in a ready-to-use
format. It even produced a translation table to keep track of these changes—no
small feat for a conversational interface.
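The Word-file side of the workflow, including the translation table, looked roughly like the following sketch, which assumes the python-docx library; SPEAKER_MAP and the file paths are again illustrative stand-ins, not the project's actual code.

```python
import csv
from pathlib import Path
from docx import Document  # pip install python-docx

# Hypothetical speaker-label map; the real table was built by first
# scanning every transcript for lines of the form "Name: ...".
SPEAKER_MAP = {"Jane Doe": "Student A", "John Smith": "Student B"}

def anonymize_docx(src: Path, dst: Path, mapping: dict[str, str]) -> None:
    doc = Document(str(src))
    for para in doc.paragraphs:
        new_text = para.text
        for name, alias in mapping.items():
            new_text = new_text.replace(name, alias)
        if new_text != para.text:
            para.text = new_text  # note: this flattens run-level formatting
    doc.save(str(dst))

Path("anonymized").mkdir(exist_ok=True)
for src in Path("transcripts").glob("*.docx"):
    anonymize_docx(src, Path("anonymized") / src.name, SPEAKER_MAP)

# The translation table, written out as a CSV for review.
with open("translation_table.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["original_name", "pseudonym"])
    writer.writerows(SPEAKER_MAP.items())
```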
Moving Toward Agentic AI
What we accomplished here hints at something much larger:
the progression from reactive LLMs to agentic AI.[4]
The difference? A reactive LLM waits for input and responds; an agentic AI
proactively engages in problem-solving: it can be tasked with a goal and can
execute multi-step operations independently, much like a research assistant would.
Moreover, GPT-4 did not simply assist; it guided decisions.
It would suggest coding solutions, ask for approval, and then execute those
ideas. This was not automation; this was orchestration. Together, we
iterated—anonymizing names, transforming transcripts, and generating structured
outputs—all within minutes. Tasks that once took days were now flowing
seamlessly between ideation and execution, with GPT acting as both a
collaborator and an executor.
A key difference from fully agentic AI was that in this
conversation, the AI would suggest a programming step to take (an agent task,
if you will) but pause to ask what I thought and whether to proceed. I could
then modify the approach, and we'd go through another iteration in the
conversation. When it took those programming steps, with my approval, it
generated the code, executed it, produced the modified files, and prompted me
to download and review them. All of this took place on OpenAI's servers, not
my local computer.
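Stripped of the chat interface, the interaction pattern we fell into resembles the human-in-the-loop sketch below. PLAN, execute_step, and run_with_approval are purely illustrative stand-ins; in the real sessions, GPT-4 proposed each step itself and ran the code server-side.

```python
# Illustrative plan; in practice, GPT-4 suggested each step in conversation.
PLAN = ["extract speaker names", "apply pseudonym map", "verify replacements"]

def execute_step(step: str) -> str:
    """Stand-in for GPT-4 generating and running code on OpenAI's servers."""
    return f"completed '{step}'"

def run_with_approval(plan: list[str]) -> None:
    for step in plan:
        answer = input(f"Proposed step: {step}. Approve, modify, or stop? ").strip().lower()
        if answer.startswith("stop"):
            break
        if answer.startswith("modify"):
            step = input("Describe the revised step: ")
        print(execute_step(step) + " -- download and review the output files")

run_with_approval(PLAN)
```

The approval gate before each execution is the whole point of the pattern: it keeps a human decision between every proposed step and its consequences.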
In short, the Ask the Professor project did not just
assist; it extended my capability. It searched, extracted, transformed, and
delivered results—an entire workflow that would have taken days to perform
manually, condensed into iterative cycles of refinement and automation.
Achilles' Heel: The Limits of Agentic AI
For all its advances, the Ask the Professor project
was not without its challenges. In pushing the boundaries of what agentic AI
could do, we also exposed its limitations—limitations that became apparent not
just in the model's performance, but in its dependence on human oversight.
Many of these limitations were mitigated over time, again through careful
iteration and structured refinement.
- Dependency on Human Judgment: Each stage of transformation—anonymization, cross-referencing, and structured processing—still depended on my strategic intervention. GPT acted as a powerful tool, even suggesting ways forward, but not as an autonomous agent. It required my oversight to identify hallucinations, correct anonymization logic, and prompt more nuanced outputs.
- Memory and Continuity Gaps: Despite repeated interactions, the model often failed to maintain name-role consistency or to remember context across documents. Anonymizing student names was particularly challenging, but a solution eventually emerged through multiple iterations of name extraction, replacement logic, and systematic document updates. This iterative process not only resolved the inconsistencies but set the stage for a more robust anonymization workflow going forward.
- Susceptibility to Hallucinations: In the early stages of the project, the assistant sometimes fabricated links or summarized content that did not exist in the files. This was corrected by refining the context instructions and limiting the model's scope strictly to the uploaded documents (see the sketch after this list). As we honed its access to structured data, hallucinations dropped off significantly, demonstrating the importance of well-defined boundaries for LLM behavior.
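That last fix amounted to tightening the system instructions. As a hedged illustration, the scope-limiting guidance we converged on was in the spirit of the sketch below; the wording and the SYSTEM_INSTRUCTIONS name are hypothetical, not the project's actual prompt.

```python
# A sketch (hypothetical wording) of the scope-limiting instructions that
# curbed hallucinations; the project's actual system prompt differed.
SYSTEM_INSTRUCTIONS = """\
You are the Ask the Professor assistant.
Answer ONLY from the uploaded transcripts and documents.
If an answer is not in those files, say so plainly; never invent
links, quotes, or document content.
Refer to students only by their anonymized names.
"""
```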
These limitations suggest that while the Ask the
Professor project pushed the boundaries of GPT-driven task execution, it
remains bounded by the need for human discernment and real-time correction.
Key Takeaways and Looking Ahead
- LLMs are capable of more than reactive responses; with structured guidance, they can execute multi-step workflows.
- However, their agentic qualities are still bounded by the need for oversight and strategic redirection.
- True AI collaboration still relies heavily on human expertise to identify hallucinations, ensure continuity, and validate outcomes.
Looking forward, the Ask the Professor project will
continue to evolve, with sharper integration of agentic capabilities, refined
anonymization processes, and deeper continuity across conversation threads. The
goal is not just a more capable assistant—but a true conversational partner
that mirrors the rigor and reflection of real-world mentorship.
We are not just building a chatbot; we are architecting a
digital aide-de-camp—an agent that learns, iterates, and collaborates in
real time. The line between assistant and agent is beginning to blur, and the Ask
the Professor project is at the frontier of that transformation. As we
refine its agentic capabilities, I can’t help but wonder: if this is what’s
possible now, what might the next iteration reveal?
[1] Note that this draft was co-produced by iterating with GPT-4, based on two
days of experience working on the Ask the Professor project. In addition to
edits, I have added some footnotes. To review a log of the conversation with
GPT-4 or the Python code it produced, send me a note.
[2] The Ask the Professor project is a work in progress that I hope to
release on my blog soon so all readers can test it. Meanwhile, you can get a
flavor of it from the earlier MVP edition, here:
https://eghapp.blogspot.com/2024/10/the-happgpt-professor-chatbot-test.html
[3] Reid Hoffman, "Impromptu: Amplifying Our Humanity Through AI," Kindle
Edition, 2023,
https://www.amazon.com/Impromptu-Amplifying-Our-Humanity-Through-ebook/dp/B0BYG9V1RN/,
and Ethan Mollick, "Co-Intelligence: Living and Working with AI," Kindle
Edition, 2024,
https://www.amazon.com/Co-Intelligence-Living-Working-Ethan-Mollick-ebook/dp/B0CM8TRWK3/.
Also see Ethan Mollick, "On Jagged AGI: o3, Gemini 2.5, and everything after,"
Apr 20, 2025,
https://www.oneusefulthing.org/p/on-jagged-agi-o3-gemini-25-and-everything.
Watch the video!
[4] For a simple overview of agentic AI, see IBM's recent blog post by Cole Stryker, "What is agentic AI?", https://www.ibm.com/think/topics/agentic-ai. For a more in-depth discussion, see Edwin Lisowski, "AI Agents vs Agentic AI: What's the Difference and Why Does It Matter?", Medium, Dec 18, 2024, https://medium.com/@elisowski/ai-agents-vs-agentic-ai-whats-the-difference-and-why-does-it-matter-03159ee8c2b4