June 30, 2025 • By Johannes
Everything started in San Francisco, of course. My friend Magnus invited me after he got into YC and showed me this other world. A crazy world. He also took me to some events where I met current YC founders. Most of them were building AI agents, like Magnus with BrowserUse. I also saw more and more marketing content from Artisan and their "Don't hire humans" campaign. They called themselves not only builders of AI agents but AI teammates. And I started thinking: what makes a real AI teammate?
I realized it's not only about functionality. Functionality is important, of course, but as with real teammates, it's not just the pure skill of a person that makes you consider them a good fit. It's also about how you can communicate with them and how they can communicate with you.
Most AI agents communicate only through website chatbots. I think that's not enough. They should be present wherever real teammates are present, except maybe at the coffee break. And where do we work most closely with teammates? In meetings. That's where the idea to build AI teammates that interact with you during meetings came from. They should be able to listen, to speak with you, and to solve all the tasks you would otherwise tell an AI agent to do after the meeting.
There are several upsides to that approach. First, it is more convenient. In a normal workflow, you sit in a meeting, decide on a task, and assign it to someone. The meeting ends, that person goes to an AI agent, tries to give it all the context needed to solve the task, and the AI generates a response. Then there's the next meeting, where the results are presented and discussed. This is annoying and redundant. The meeting transcript already contains the context, so why repeat it?
Second, the traditional workflow is slow. Consider another case: you are in a meeting and decide on a new task, for example, drafting a presentation. You tell joinly to do it. Joinly works on it immediately while you continue with other topics. Then joinly tells you it's ready and shows you the results.
Or think about tedious tasks like scheduling a follow-up meeting - you simply tell joinly and it sets it up. Writing a follow-up email? Joinly drafts it on the spot. And when you need information live in the meeting, whether from the web or your corporate data, joinly can get it instantly. The possibilities are endless. In the future, there will be one teammate in every meeting whose job is to make it as productive as possible. That teammate will not be human anymore: it will be joinly.
From the technical side, there are two parts. First, the ability to communicate in the meeting: joinly needs to be able to speak, to write in the chat, and to understand everything in real time. Second, the functionality: the skills joinly should have to solve tasks, such as searching the web, setting up follow-up meetings, writing emails, and drafting presentations.
For the first part, we built a joinly MCP server that gives an AI agent (MCP client) the tools to communicate in the meeting and, as a resource, the real-time meeting transcript (if you want further information, have a look at our GitHub repository). For the second part, we set up a joinly agent/client that you can customize to add functionality, for example by adding your favorite MCP servers like GitHub, Tavily, and Notion. As the architecture shows, you do not need to use our joinly client to use the joinly MCP server. You can also connect your own agent to the joinly MCP server and make it meeting-ready.
If you work with our example joinly client, which you can find on PyPI, you basically only need two commands in two different terminals to try it out. First, pull and start a Docker image with the joinly MCP server.
# Command 1
docker run -p 8000:8000 ghcr.io/joinly-ai/joinly:latest
Then start and connect the joinly client from the joinly client package, telling joinly which meeting to join. (Don't forget to add a .env file specifying your LLM API key.)
# Command 2
uvx joinly-client --env-file .env <MeetingUrl>
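The .env file could look like the following. The variable name `OPENAI_API_KEY` is an assumption for the OpenAI backend; use the variable your chosen LLM provider expects.

```shell
# Hypothetical .env file for the joinly client
OPENAI_API_KEY=your-api-key-here
```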
Of course, you can also modify our client. For that, have a look at the client package.
It was important to us that you can use joinly in whatever way suits you. The first decision is whether you want to run it locally or use APIs. If you run it locally, you can use Whisper for STT (default), Kokoro for TTS (default), and an Ollama model as the LLM. Alternatively, you can connect APIs: we recommend Deepgram for STT, ElevenLabs for TTS, and the OpenAI API for the LLM.
We're often asked how we join meetings. We join via the browser. So in principle, joinly can join any browser-based meeting, but so far we mainly cover Zoom, Google Meet, and Teams. And who knows, maybe we will create our own AI-first meeting platform so that integration becomes even smoother.
If you like the idea, show us some love by adding a GitHub star and a Twitter follow. We're excited to make your meetings more productive. See you soon!