What are We Doing?
We’re going to look at a plaything I built in just a handful of hours while digging into agentic AI. (Well, for career-related reasons as well, but mostly for fun.) The toy is an AI-enabled Digital Product Manager assistant app where you can ask questions about product management. Which, as it turns out, could actually be a real product. I built it as a toy project, but I might actually soft launch this thing. Because, why not? Can you just do this with any GPT? Sure. But this one is tuned specifically to product management in general and digital product management in particular. (There are others like it, though, so maybe I’ll just leave it as a personal tool. We’ll see.)
Why do this? Why bother? And why do you care? Since the vibe coding thing is so very in right now (maybe for another few months), I figured it was time to jump in a bit. While I’ve built some agent workflows in the past and built a variety of apps, it’s usually been team based. This is one of those “Hey, if I can do this… anyone can” type posts. The question, of course, is whether you have a reason / use case. But the arguments that used to exist about some things being “too hard” or “too technical” or “too much time” are fading away with some of these tools. Not completely. And some things absolutely (my opinion) require “real” developers. But others? Lower risk things? Personal productivity things? Not as much anymore. So I’m going to go through my process at a super high level. My goal is to convince other product manager types to dive into this area more deeply than just watching a webinar and learning some of the lingo. Even for senior roles and beyond individual contributor roles, I personally think it’s useful to get a visceral feel for how things work. Doing so offers better context for what teams might be going through, plus an understanding of what might be possible. It also gives a sense of the budget implications if you’ve got P&L responsibility for a product.
Next we’ll get into the details, but if you want to see the end result, it’s here: (But note, the functionality won’t work as I’ve got the public webhooks turned off so as not to burn up my paid quotas on the services in use. To see it working, check the Loom Demo.)
Direct Link to the App Test Website
Skills Involved: Minimal
Who am I? Long time senior level product person. Mostly I manage teams of folks across product, design, analytics, and even marketing/sales on a couple of special occasions, and of course there’s always some individual contributor work. (“Special occasions” in the Product world means no one else is on board to do it yet, so you get it started.) Once upon a time, I did do some very minor development, but haven’t written production code in approximately 3.4 million years. And even then, it was mostly front end, maybe some database work, etc. Now, that being said, I’ve defined schemas, helped with architectures, dived in fairly deep with teams. This is partly because I’ve varied between senior management and individual contributor. The point is I’m nowhere close to being a professional developer. Not by a long shot. But I built this in maybe 6 hours across a few evenings with – at best – paltry atrophied development skills. (And if I need to do something similar now, it would probably take half that time.)
What are the Inputs?
Data Sources: OpenAI model, a personal corpus of product management knowledge gathered in Atlassian Confluence pages over a couple of decades, and general web search via an AI-specialized search tool (Tavily).
Tools:
- OpenAI ChatGPT Large Language Model (LLM) (Paid pro account)
- n8n AI Workflow Automation (Paid basic account)
- Atlassian Confluence Wiki spaces / pages (Paid pro account)
- Pinecone Vector database for specialty content. (Free trial version)
- Lovable.dev Website app builder. (Free trial version)
- Tavily Web search API tailored to LLM / Retrieval Augmented Generation (RAG) usage. (Free trial, limited usage.)
- GitHub To store code. (Free version)
- Postman To check API calls separately from the system, just for confidence. (Free version)
Time to Build: About 6 hours in total. But that includes two sessions of approximately one hour each debugging two separate problems, both of which occurred only because I didn’t happen to know a few aspects of the back-end tool I’m using. If using the same functions again, I could ideally avoid that two-hour loss. (Though of course, if doing something else new, chances are something else will pop up.)
High Level Flow
Here’s the super simple high-level flow:
- I created a webhook in the n8n back-end agent tool so it can function as the backend for the web application, which is generated by Lovable.dev. (A simple explanation of webhooks: one app exposes a URL, and other apps send requests to it. They’re a basic means by which apps can work together.)
- Then I used the Lovable web development tool to build a front-end shell with a chat interface driven by results coming from n8n. It took maybe 5 – 10 minutes to write up an OK prompt for what I wanted, then maybe only 5 minutes for the app to generate a whole front page UI. No other helper pages (like About Us and such) behind it. Just the functional part. For now.
- Finally, I developed the rest of the workflow in n8n to process input with a combination of LLMs, search results, and (initially) a vector database for RAG / Confluence data. The Agent node orchestrates taking the output from the various data source tools, with Merge/Aggregate nodes putting together the final result. (Update: I decided to streamline things a bit and query Confluence directly via a RESTful API search; there’s a sketch of that call just below. My main reasoning was to avoid having to vectorize pages for RAG, avoid increased costs, and not have to manage updates/upserts from Confluence to a Pinecone database. If doing this at scale, though, it might make sense to test which is the better option. It’s probably faster to hit an existing vector store than to kick off fresh searches.)
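For the curious, that direct Confluence query is less exotic than it sounds. Here’s a minimal TypeScript sketch of the kind of call the n8n HTTP Request node ends up making, assuming a Confluence Cloud site; the site URL, credentials, and the “PM” space key are all placeholders, not my real setup:

```typescript
// Minimal sketch of a direct Confluence Cloud full-text search (no vector store).
// The site URL, email, API token, and space key are placeholders.
const SITE = "https://your-site.atlassian.net";
const AUTH = btoa("you@example.com:YOUR_API_TOKEN"); // Basic auth: email + API token

async function searchConfluence(question: string) {
  // CQL: full-text match scoped to one space; expand=body.storage returns page bodies
  const cql = encodeURIComponent(`text ~ "${question}" AND space = "PM"`);
  const res = await fetch(
    `${SITE}/wiki/rest/api/content/search?cql=${cql}&limit=5&expand=body.storage`,
    { headers: { Authorization: `Basic ${AUTH}`, Accept: "application/json" } }
  );
  if (!res.ok) throw new Error(`Confluence search failed: ${res.status}`);
  const data = await res.json();
  // Return just the bits worth stuffing into the LLM prompt as context
  return data.results.map((page: any) => ({
    title: page.title,
    excerpt: page.body?.storage?.value ?? "",
  }));
}
```

In the actual workflow this is just an HTTP Request node configured with the same URL and auth. The point is there’s no vector math involved: it’s a keyword search plus whatever ranking Confluence applies, which is exactly the tradeoff mentioned in the update above.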
Backend “Agentic” Workflow
Here’s the n8n backend main workflow. It was built in steps along the way to test each module as I went. But really, you don’t even need much of the website front end to test. You can see the input / output for each step in the tool. And all we really care about here is an initial question and a final output to send back. Though I liked using the website as I just got a kick out of seeing the screen update given it was so comically simple and fast to build. My grade school daughter has a snap together electronics kit and likes to make the buzzer and light activate. Same thrill for dad really. By the way, I’m not entirely sure just how agentic you can call this. I’m not using agents to choose tools or do heavy reasoning. But they’re following some instructions, so I’m going to claim it; at least at a simplistic level.

Just for fun, I made a separate sub-workflow for the personal Wiki space content retrieval. I’ve been adding to that area for years and thought it might be nice to just ask it questions rather than look things up. The yellow area above isn’t in use yet; that’s where I was going to use a vector database for the Wiki content before I decided to cheat and just use search instead. Here’s what that sub-workflow looks like.

Front End Time
Next, we’ll look at what Lovable kicked out for the interface and its associated codebase.

Nice.
As mentioned, I’m Product, not Dev. But I have just enough dev experience to recognize what’s been created here, and, if I wanted to, to know how to take this all out of here and load it up on my own server. The CSS is reasonably clean. The design is minimalist, but that’s what I asked for. You can iterate on design with more prompts, or just upload images of what you like and the tool will attempt to match styles. This can be a bit of a rabbit hole, and yet it’s still fairly quick to get iterations.

If I get some time soon, I’ll do a Loom or something to have video showing the app running along with how the agents do their thing.
My Take On The Experience
- I put a prompt into the Lovable tool to create my website. Basically, I told it a color scheme and that I wanted an AI-style chatbot interface. Plus I told it the webhook URL for the n8n backend. That was pretty much it. Next, I just pressed a button. Maybe 5 minutes later, a working web page. Very cool.
- On first test, the tool was disappointed when my attempt to send a message to the back-end agent failed. Or at least, it seemed sad about this and wanted to help. So I asked Lovable to make sure the webhook URL was correct. It said it was. But then it helpfully went and tested it and asked me if I’d meant to send a DELETE request to the agent. Well, I didn’t. I wanted a POST, which is what I thought I’d set up. But somehow, it didn’t save like that. I have to assume I fat-fingered an incorrect dropdown option and accidentally saved it. My fault. I apologized to the AI tool, and it told me it would be ok. So really, it was my error. (Note: Yes, I know I said I’m not technical, but I do understand the basics of HTTP and APIs and such. There are places in these workflows where some degree of technical background does help.)
- In this case, I didn’t even have to troubleshoot that particular issue. When I asked Lovable about one possible source of the problem, it said that wasn’t it, but suggested checking the real problem. It was smart enough to look at the error message that had come back from n8n and tell me about it. In my next test (only the second), the message went through just fine. Nothing else happened on screen, as I hadn’t yet defined the rest of the workflow to handle sending back a response. I just wanted to test this first step of getting a message from the front end to the back end; there’s a sketch of that call right after this list. Total setup time so far: maybe 30 minutes. And that’s just because this was my first time doing this and I had to create some new accounts and settings in the tools. (Though I was already somewhat familiar with n8n.)
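For reference, here’s roughly the call the front end should be making (and what my misconfigured webhook was rejecting). A hedged sketch: the URL and the payload / response field names are made-up placeholders, not my real endpoint:

```typescript
// Rough shape of the front end's call to the n8n webhook.
// The URL and field names are placeholders for illustration.
const WEBHOOK_URL = "https://example.app.n8n.cloud/webhook/pm-coach";

async function askCoach(question: string): Promise<string> {
  const res = await fetch(WEBHOOK_URL, {
    method: "POST", // must match the HTTP method the n8n Webhook node was saved with
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ question }),
  });
  if (!res.ok) throw new Error(`Webhook call failed: ${res.status}`);
  const data = await res.json();
  return data.answer; // whatever field the workflow's final response node sends back
}
```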
Lovable, amusingly (or maybe disturbingly), makes several errors, but then helps fix its own errors. If it had more fully reasoned through its initial code writing, maybe this wouldn’t have happened. And yet, just as with a person building something, maybe an error doesn’t show up until a run attempt is made. I don’t know. To some degree, I could trace the errors through the logs, but I didn’t have the patience to bother. Why should I when I can just say, “Fix this please”? Sure, I’m kind of curious. And maybe I’ll go back and look. But that’s not the point of this exercise.
Early Success

Here’s a first, early success. The Lovable front end can respond end-to-end, using a webhook to send the query to my n8n workflow and accepting a response back to give me an answer. This is a simple response using OpenAI’s GPT-4o mini alone. It does not yet incorporate more sophisticated honing in the n8n workflow or any of the Retrieval Augmented Generation (RAG) options I have available. As a product, there’s no good reason to use this over just going direct to a GPT until/unless I add in custom content from my RAG sources.
Doesn’t matter. This is a big win. This is the core scaffolding. If I wanted to, I could clean up this UX a bit (or not), slap a real URL on it, and there’s my app. Though I’d also probably have to up my paid quota at OpenAI and throw on some advertisements. Then the game would be the same arbitrage as any content site: can I make more $$$ via advertising (or subscriptions or lead generation, etc.) than I’m spending on hosting, development, etc.? One way to manage costs is to limit the query sizes going to any AI tools in order to keep the token transfers lower. The real challenge with costs is likely going to be the next step of customization, as vector databases can get expensive just as GPTs can. I like using Pinecone as a vector database (and I did that here using a free trial), but I’m not sure I want to pay a subscription for that just to play games here for now. (Or if I want to keep this thing for myself. Or… maybe even make it real and launch it. Crap. You know what? I think my little play project might just be about to become a real product. Sort of. That would take more work for a not very differentiated product. And the real hassle won’t be the build. It will be going back through my Confluence docs to make sure they’re clean. Did I leave any personal info in there? Any access / password info? Do I need to check for taxonomic balance? After all, the info is my brain dumps over years; the topic areas are neither mutually exclusive nor exhaustive, as a good taxonomy would be. Does that even matter here anyway? After all, I’m not going to be relying heavily on metadata. The vector space in the RAG database portions will layer on top of an LLM transformer. So maybe I don’t care about that level of normalization here anyway. You know what… I don’t.)
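On that cost point, the crudest version of limiting query sizes is just clamping what you send before it ever hits the model. A toy sketch; the character budget and the “about four characters per token” rule of thumb are rough assumptions, not real tokenization:

```typescript
// Crude cost guardrail: cap what goes to the LLM so token counts stay bounded.
// ~4 characters per token is a rough English-text heuristic, not exact tokenization.
const MAX_PROMPT_CHARS = 4000; // roughly 1,000 tokens, give or take

function clampPrompt(question: string, retrievedContext: string): string {
  // Always keep the user's question intact; trim retrieved context to the leftover budget
  const budget = Math.max(0, MAX_PROMPT_CHARS - question.length);
  return `${question}\n\nContext:\n${retrievedContext.slice(0, budget)}`;
}
```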
For those interested in the code side, Lovable chose to build with the following:
- Vite (Front end build tool / development server)
- TypeScript (Typed superset of JavaScript)
- React (Interface components)
- shadcn-ui (UI component library)
- Tailwind CSS (CSS framework for styling)
The whole front-end part of the project was conveniently synced to my GitHub:
https://github.com/ScottGsHub/pm-coach-helper/
Biggest Challenges & Impressions
Speed. As is often the case, the first time you learn to do something, it goes slowly. Magically, the next time is a little better, and a few times later? You wonder why all of these easy things used to take you so long. One of the best things I learned in a ski instructor class (that is, a class where we were learning how to teach others) was that one of the hardest things about teaching is remembering what it felt like not to know something. This process took me about 6 hours over a few evenings, which is about twice as long as it should have; if I have to do something similar, it’ll go much faster. On the other hand, if I were doing this the ‘old fashioned’ way, it would have taken several days. And I think that’s true even working with a designer and programmer. If I wanted to clean this thing up and make it more of a product, I’d estimate at least a week or so. Which is still fairly fast.
Knowing What’s Going On. There are two ways to build with these tools: 1) just somehow make it work, or 2) try to really understand what’s going on. For example, you can use the GPTs to get the code to make a call from a node to a search engine or something. But… it’s probably useful to understand at least a little bit of JavaScript and how JSON is formatted. As atrophied as my own weak programming skills may be, I think my troubleshooting went much faster than if I’d had no clue what these things were. Here’s an n8n pro tip: if you’re trying out this tool and struggling to debug a sub-workflow, it’s probably because it doesn’t look like it’s doing anything. But it probably is. In n8n, the main workflow canvas shows live execution data (green highlights, node output previews, etc.) when you run it directly. But for sub-workflows, the live preview isn’t shown on the canvas, only in the Executions view. (This cost me a whole hour before ChatGPT helped figure it out.)
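To make the “know a little JSON” point concrete: n8n passes data between nodes as an array of items, each with its payload wrapped under a json key. The field values in this sketch are made up, but the wrapper shape is what you’ll see in the tool’s input/output panels:

```typescript
// The shape n8n passes between nodes: an array of items, payload under "json".
// The field values here are made up for illustration.
const items = [
  { json: { question: "How should I prioritize a roadmap?", source: "webhook" } },
];
console.log(JSON.stringify(items, null, 2));

// Inside an n8n Code node you return the same shape, e.g.:
// return items.map((item) => ({ json: { ...item.json, handled: true } }));
```

Once you recognize that wrapper, half the confusing node output in the debug panels starts making sense.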
Powerful Danger. As I’ve said elsewhere, these tools are powerful. And magical. And amazing. And still at early stages. What you can do is just stunning. Just as with any tools, deft craftspeople will likely be able to make wonderful things. Others? Others may make things that range from bad to dangerous. The worst cases will have traits similar to some GPTs themselves. That is, they’ll be confidently wrong! The most dangerous might be those projects that get the UI/UX right and so seem buttoned up and trustworthy, even though the back end might be riddled with wrongness, from just plain poor results (whatever those may be) to privacy, security, and possibly regulatory failures. We’ve had Web 1.0, 2.0, 3.0 (which is arguable), then some say 5.0. This might be, for now, Web What the Hell is Happening Now Point Oh. Do you know how you sometimes still come across some cheapie little garbage website from the 1990s that is somehow still chugging along? Well, this is like that. Only different. Because the older garbage zombie websites are just sitting there minding their own business and mostly being ignored. These new things? They’re potentially active. They’re not just sitting around. They’re participating. They’re knocking on doors. They’re initiating. For all the amazing good some of these apps might put into the world, I’m not sure what scares me more, real bad actors or random failures. Why? Because as many bad guys as there are in the world (even powerful state-sponsored ones), they’re still a small percentage of it. And there are whole teams of good guys listening for them and ready to engage. But this other thing? It’s a great mass of creators doing the monkeys-pounding-on-keyboards thing. Some will craft novels. Others are probably going to break things. If crowdsourcing can be wildly successful in finding some winners, I wonder how damaging crowdsourcing can be in accidentally exposing vulnerabilities.
Production. I’m not sure I’d trust this for any serious production yet. Note two things: 1) others have used this to produce working products that are out there, and 2) my use here is not only beginner level, but I’ve been using a couple of paid accounts coupled with some free API services, and test environments only. Still, I’ve found that sometimes API endpoints get dropped or need resets, and the same goes for some n8n workflows. It’s possible that moving to production endpoints and such is more stable. Either way, I’d want great observability and monitoring for events. Which, of course, means relying on the tools themselves to do so (unwise), or introducing a separate piece of software (more complicated and likely less “no code”).
External Dependencies. Our apps have always been dependent on various services. I’d argue the entire ’net and everything on it is really a giant exercise in parameter passing, to some depth of complexity or another. Agentic workflows are another level. These apps require a variety of access points and credentials to be accurate, live, and reliable on an ongoing basis. That’s a lot to ask when many services may be outside of formal business arrangements. One potential value of using blockchain tech and crypto with AI agents is the use of token economies to transact. Gas fees and such (even with so-called “gasless” transactions) will add yet another layer of execution dependency. Builders and users will have to pay attention here. The flexibility will create a lot more potential failure points and possibly more challenging troubleshooting.
Next Steps for my Toy
I’m really not sure. Maybe nothing. I’m going to sit on this a week and see how I feel about it. If I get some time, I’ll maybe clean up the UI/UX a little and slap some advertisements on there. Then I’d need to pay for a couple of pro accounts on a few services. Setting limits to keep any runaway APIs to a couple hundred dollars seems like a good idea. Then maybe put on a URL and see what happens. Sometime late next month I’ll be having some leg surgery to fix a screw-up from a little ice hockey incident. Since I’ll be mostly lying in bed for a week or more, if the painkillers don’t make me too out of it, maybe I’ll have some spare time to give it a shot. If so, I think I get a Vibe Coding membership card from somewhere and join the “Look Who Else Threw Up Some Low Quality Product Code on the Internet” club. But if it works, it works. We’ll see.
More seriously, I mostly did this because a) I like to play with new tools, b) I think whether you’re an individual contributor or a senior product person, you should have at least a clue as to how things work, and c) it really is just fun to make things with these tools. The challenge I always had with coding was the inordinate amount of time it takes to get good with syntax across multiple languages and frameworks. I still believe true production-quality work, even for simple apps, likely requires some degree of ‘real’ skill and experience. Otherwise, there’s potentially some non-trivial risk, depending on the application. But for fast prototyping and testing? Yeah, this is next level.