Building an Ecommerce Chatbot Prototype for Jem&fix

2026-05-12 · Tommy Jepsen

I built a working ecommerce chatbot prototype for Jem&fix, a Danish hardware store chain, to explore how good a product-finding assistant could actually feel when grounded in real product data, real prices, and real store-level inventory.

The short version: a customer can describe a project in plain Danish, like "I want to hang a baby swing from a beam", and the bot figures out what they actually need to do it. It recommends the right hook, tells them the price, how many are in stock at their store, and exactly which aisle and shelf to find it on.

Here is how it works under the hood.

The Stack

  • Data source: Jem&fix's own public product API (scraped per category)
  • Database: Postgres with the pgvector extension
  • Embeddings: Local open-source embedding model via @huggingface/transformers (1024 dimensions)
  • Query understanding & answer generation: OpenAI GPT
  • API layer: Express + Drizzle ORM
  • Conversation memory: A small thread/messages table for multi-turn context

No external vector DB, no LangChain, no agent framework. Just Postgres, a local embedding model, and a few OpenAI calls glued together.

1. Pulling the Catalogue

The first job was getting the products. Jem&fix has a JSON API behind their category pages, so I walked their sitemap to collect category numbers and then paged through ProductsByCategoryAndFilters to dump every product into Postgres.
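
Roughly, the paging loop looks like this. Only the ProductsByCategoryAndFilters endpoint name comes from their site; the URL shape and parameter names below are my stand-ins, not the real API:

// Hypothetical paging loop: the URL and query params are assumptions,
// only the ProductsByCategoryAndFilters endpoint name is real
async function scrapeCategory(categoryNumber: string) {
  const products: unknown[] = [];
  for (let page = 1; ; page++) {
    const res = await fetch(
      `https://www.jemogfix.dk/api/ProductsByCategoryAndFilters?category=${categoryNumber}&page=${page}`
    );
    const data = await res.json();
    if (!data.products?.length) break; // past the last page
    products.push(...data.products);
  }
  return products;
}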

For each product I store:

  • Title, description, URL, image
  • Price (including campaign/discount fields)
  • Store placement: section, bookcase, shelf and stock count per store

Store placement is the part that makes this feel less like a chatbot and more like a real assistant. When the bot tells you "Aisle 7, shelf 2, 14 in stock", that comes straight from the same data the staff use.
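
For reference, here is a minimal sketch of how those tables could look in Drizzle. The column names are my guesses, not the actual schema:

import { pgTable, serial, integer, text, numeric, vector } from "drizzle-orm/pg-core";

// Guessed schema: one product row, plus one placement row per product per store
export const product = pgTable("product", {
  id: serial("id").primaryKey(),
  title: text("title").notNull(),
  description: text("description"),
  url: text("url"),
  image: text("image"),
  price: numeric("price"),
  campaignPrice: numeric("campaign_price"),
  embedding: vector("embedding", { dimensions: 1024 }),
});

export const placement = pgTable("placement", {
  productId: integer("product_id").notNull(),
  storeId: integer("store_id").notNull(),
  section: text("section"),
  bookcase: text("bookcase"),
  shelf: text("shelf"),
  stockCount: integer("stock_count"),
});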

2. Embedding the Products Locally

For each product I generate a 1024-dim embedding from a concatenation of title, description and price, using a Mistral embedding model running locally through @huggingface/transformers:

import { pipeline } from "@huggingface/transformers";
import pgvector from "pgvector";

// Load the local model once and reuse it for every product
const extractor = await pipeline("feature-extraction", MODEL_PATH);

// Mean-pool the token embeddings into a single 1024-dim vector
const output = await extractor(text, { pooling: "mean" });
const embedding = output.tolist()[0];

await pgPool.query(
  'UPDATE "product" SET embedding = $1 WHERE id = $2',
  [pgvector.toSql(embedding), product.id]
);

Running the embedding model locally means zero per-product cost and no rate limits while indexing a few thousand SKUs. I batched it 50 products at a time with a connection pool, and the whole catalogue embeds in one pass.
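
The batching loop itself is nothing fancy. Here, chunk and embedProduct are illustrative helpers, not the actual code:

const BATCH_SIZE = 50;

// Process the catalogue in chunks of 50 so the pool is never overwhelmed
for (const batch of chunk(allProducts, BATCH_SIZE)) {
  await Promise.all(batch.map((p) => embedProduct(p)));
}

function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}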

3. Understanding the Customer's Question

The naïve approach (embed the user's question, do a vector search, done) falls apart fast. People don't ask for "armeringsmørtel" (reinforcement mortar). They ask "what do I use to fill cracks in concrete?" or "I want to hang a baby swing".

So before any retrieval happens, I ask OpenAI GPT to extract candidate product names from the message:

Customer: "I need some screws for a sandbox"
→ "Galvanised wood screws, Wood screws, Screws"

Customer: "I need a screw to hang a baby swing"
→ "Screw hook, Swing hook, Baby swing hook"

Customer: "What's the weather like?"
→ "ingen"   (none)

ingen ("none") is the escape hatch. If the message isn't shopping intent, I skip retrieval entirely.

4. Hybrid Search: Vectors + Trigrams

With candidate product names in hand, retrieval runs in two layers:

Vector search for semantic matches:

SELECT id, title, price, url,
       1 - (embedding <=> $1) AS similarity
FROM product
ORDER BY embedding <=> $1
LIMIT $2;

I filter to similarity > 0.8 to avoid garbage matches.

Trigram search (Postgres's pg_trgm extension) as a fallback for short or brand-name-y queries. Vector search is great at concepts but terrible at "M8 bolt" or a specific SKU prefix:

SELECT *, similarity(title, $1) AS similarity
FROM product
WHERE title % $1
ORDER BY similarity DESC
LIMIT $2;

I split the LLM-suggested product names on commas and run each one through both searches. Then I dedupe by product id and join in the store placement row.
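
Sketched, with embed, vectorSearch and trigramSearch standing in for the local model and the two queries above:

type Candidate = { id: number; title: string };

// Illustrative merge step: run every candidate name through both layers,
// keep the first hit per product id
async function retrieve(productNames: string[]): Promise<Candidate[]> {
  const byId = new Map<number, Candidate>();
  for (const name of productNames) {
    const queryEmbedding = await embed(name);
    const hits = [
      ...(await vectorSearch(queryEmbedding)),
      ...(await trigramSearch(name)),
    ];
    for (const hit of hits) {
      if (!byId.has(hit.id)) byId.set(hit.id, hit); // dedupe by product id
    }
  }
  return [...byId.values()];
}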

5. Letting the LLM Filter

Even with both layers, retrieval is noisy. So I do an evaluation pass: send the candidate titles back to OpenAI GPT along with the original question and ask it which ones are actually relevant. Anything it doesn't list gets dropped.

This is the cheapest reliable way I have found to clean up retrieval noise. The model sees titles only (not embeddings, not scores) and decides what a human shopper would consider on-topic.
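
In code it is one more chat call; the prompt below is a paraphrase:

// Illustrative relevance filter: keep only titles the model echoes back
async function filterRelevant(question: string, titles: string[]): Promise<string[]> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    temperature: 0,
    messages: [
      {
        role: "system",
        content:
          "You get a customer question and a list of product titles. " +
          "Reply with only the titles that are actually relevant, one per line.",
      },
      { role: "user", content: `${question}\n\n${titles.join("\n")}` },
    ],
  });

  const kept = new Set(
    (completion.choices[0].message.content ?? "")
      .split("\n")
      .map((t) => t.trim())
      .filter(Boolean)
  );
  return titles.filter((t) => kept.has(t));
}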

6. Writing the Answer

The surviving products get formatted into a structured block with price, stock, aisle and shelf, then handed to OpenAI GPT with a Danish system prompt:

Du er en ekspert butiksassistent for jem&fix byggemarked. Hjælp kunder med at finde de rigtige produkter. Nævn specifikt pris, lager og placering når relevant.

("You are an expert store assistant for the jem&fix hardware store. Help customers find the right products. Specifically mention price, stock and placement when relevant.")

Temperature is set low (0.3) because customers don't want a creative assistant, they want a correct one.
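
Put together, the final call looks roughly like this; formatProductBlock is an illustrative helper that renders one product's price, stock, aisle and shelf:

// Illustrative final call: recentMessages comes from the thread store (next section)
const completion = await openai.chat.completions.create({
  model: "gpt-4o-mini",
  temperature: 0.3, // low: customers want correct, not creative
  messages: [
    { role: "system", content: DANISH_SYSTEM_PROMPT },
    ...recentMessages,
    {
      role: "user",
      content: `${question}\n\nProdukter:\n${relevantProducts
        .map(formatProductBlock)
        .join("\n")}`,
    },
  ],
});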

7. Threads & Conversation Memory

A ThreadManager class persists every user message and assistant response in Postgres. On each request I pull the last 5 messages and prepend them to the OpenAI call. That gives multi-turn behaviour without any frontend state. You can ask "do you have it in green?" as a follow-up and it works.
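
A minimal sketch of the two operations that matter, with guessed table and column names:

import { Pool } from "pg";

// Guessed schema: thread_id, role, content, created_at on a message table
class ThreadManager {
  constructor(private pool: Pool) {}

  async append(threadId: string, role: string, content: string) {
    await this.pool.query(
      "INSERT INTO message (thread_id, role, content) VALUES ($1, $2, $3)",
      [threadId, role, content]
    );
  }

  async lastMessages(threadId: string, limit = 5) {
    const { rows } = await this.pool.query(
      "SELECT role, content FROM message WHERE thread_id = $1 ORDER BY created_at DESC LIMIT $2",
      [threadId, limit]
    );
    return rows.reverse(); // oldest of the five first, ready to prepend
  }
}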

What the Chatbot Can Actually Do

  • Help customers figure out how to build or fix something, then recommend the exact products they need to do it. "How do I patch a hole in drywall?" becomes a short how-to plus the filler, sanding block and primer to buy
  • Answer natural-language product questions in Danish, semantically grounded in the real catalogue
  • Surface live prices and discount/campaign info
  • Tell the customer exactly where the product sits in the physical store (section, shelf, stock count)
  • Maintain context across a multi-turn conversation
  • Recognise non-shopping messages and respond conversationally without forcing retrieval
  • Serve a separate /bestoffers endpoint that ranks the top campaign savings across the catalogue

The biggest takeaway for me, though, is how far you can get with Postgres + pgvector + a small local embedding model + a few well-placed LLM calls. No vector DB, no orchestration framework, no agent loop. Just a pipeline you can read top to bottom in a single file.

Follow me on LinkedIn for more updates and stories