I tried out the Comet browser


Prelude

Back when ChatGPT was first released to the public, I was unimpressed. While it was clearly better than prior AI releases, it was still seriously lacking in its abilities and value proposition. Midjourney, with its contemporaneous virality, struck me as the cooler trend, and the only times I used ChatGPT were when I needed the name of something that I could only describe. Years of empty promises from Amazon and Google had made me skeptical about AI, and ChatGPT seemed to be another miss.

After two years of largely ignoring the AI hype, I heard about DeepSeek. DeepSeek was making waves as a relative dark horse entering the AI race out of the east. After trying it out, I started to see the value of LLMs: DeepSeek thoroughly and accurately investigated anything, including politically sensitive topics, and its self-censorship would only trigger if I really tried to set it off. Yet, I still wasn’t very excited about AI because I prefer to do research myself. At most, what I saw in DeepSeek was a starting point for research, or a good way to summarize information on low-stakes issues.

Once again, time went by and my attention was elsewhere. Then an article about an “AI web browser” hit my feed. I dismissed it instantly and filed that headline under the “probably clickbait, or a waste of time” folder in my brain. But I was intrigued when another headline mentioned that the AI in the web browser was “agentic”. That term was new to me, but I couldn’t resist the implication that a real AI model available to the public could act on its own.

The more I read about the agentic AI browser, Comet, the more interested I became. How does it work? What can it do? I had to find out.

First contact

Imagine my disappointment when I saw that Perplexity was charging $200 per month for early access to Comet. Thinking back to all of the other AI vaporware, I decided that $200 was more than I was willing to bet on a potential gimmick. I joined the waitlist and crossed my fingers.

My wish was granted in under two weeks. Early on a Monday in the middle of August, I received a notification that Perplexity had taken me off its waitlist, along with a download link for Comet. I downloaded it and began seeing what it could do.

First impressions

It took me a while to figure out how to engage the agentic side of Comet. The “Assistant” button didn’t seem to work right away, which confused me. But I did appreciate the convenience of having access to most of the major models right on the new tab page.

Image: Comet browser new tab. Comet by Perplexity: the default page is very sparse, but they make it easy to choose between models from Anthropic, OpenAI, and xAI.

After figuring out how to access Comet Assistant, I set it to work. Its first task was to create an account on my website, read the documentation, and write a Python program that uses the authentication API. I had a new account and valid Python code returned to me in under three minutes. Unfortunately, the Assistant was confused by the different API versions, but that mistake was quickly corrected when I specified which version to use.

I demoed Comet Assistant at work, and I gave it a hard task for the demo. I told it to find parts on Digikey that matched specific criteria, even though the site had no search filter for the parameter in question. Once the Assistant caught on to the lack of a search filter, it scanned a number of PDF documents and returned five correct parts.

Just to see if it could be useful for automated tests, I asked Comet Assistant to change a setting in a React app. This was its first big blunder. From what I can tell, Comet Assistant uses URL and DOM manipulation instead of mouse clicks and keyboard input. This is a problem for React or other environments where JavaScript is used heavily. Comet could only scroll through the page and select text inputs, which made it useless for testing.
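
To illustrate why that would matter, here is a toy model in Python. It is only a sketch of the general idea, not Comet's actual mechanism (which I can only guess at) and not real React or browser code. The names ToyInput and ToyApp are made up for this example. The assumption is that the app only updates its own state inside an event handler, so writing a value straight into the element changes what the element holds but never tells the app.

```python
from typing import Callable, List


class ToyInput:
    """Hypothetical stand-in for a page element; not a real browser or React API."""

    def __init__(self) -> None:
        self.value = ""
        self._listeners: List[Callable[[str], None]] = []

    def add_listener(self, handler: Callable[[str], None]) -> None:
        self._listeners.append(handler)

    def set_value_directly(self, text: str) -> None:
        # Direct manipulation: the element changes, but no event is fired.
        self.value = text

    def type_text(self, text: str) -> None:
        # Simulated user input: each keystroke changes the element
        # and notifies every registered listener.
        for ch in text:
            self.value += ch
            for handler in self._listeners:
                handler(self.value)


class ToyApp:
    """Hypothetical app whose state only changes inside its event handler."""

    def __init__(self, field: ToyInput) -> None:
        self.state = ""
        field.add_listener(self._on_input)

    def _on_input(self, new_value: str) -> None:
        self.state = new_value


field = ToyInput()
app = ToyApp(field)

# Case 1: write the value straight into the element.
field.set_value_directly("dark mode")
state_after_direct_write = app.state  # the app never heard about the change

# Case 2: clear the element and "type" the same text.
field.value = ""
field.type_text("dark mode")
state_after_typing = app.state  # the app state follows the events
```

In this toy, the direct write leaves the app's state empty even though the element shows the new text, while simulated typing keeps the two in sync. If Comet Assistant really does work the first way, that would explain why the setting in my React app never changed.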

As a final test, I asked it to create a license key on my website and then sign up as a user with that key. Its goal was to evaluate the user experience of the whole process. While it did eventually give me feedback, it was overly nice. The feedback it gave makes me think that it doesn’t really “see” the web in a graphical sense, which limits its usefulness. On the plus side, it prompted me for confirmation before pressing a button labeled “delete”. While such prompts could become overbearing for certain tasks, I really appreciate seeing this sort of thoughtful design from the team at Perplexity.

Conclusion

As someone who has been broadly skeptical of the promises made by AI companies, I have to applaud Perplexity for changing my mind. While Comet is still extremely new to the market, it shows immense promise. I rate Comet a 6/10, because while it succeeds at many things, it also falls short of its lofty goals. Comet is not a mature product, but its story has barely been written. If things continue on this path then it will become truly great.

Specifically, Comet Assistant could benefit from learning how to click and type, and from staying on task for longer. These changes alone would drastically increase its versatility. Comet Assistant should also have the ability to view a rendered page. Perhaps this could be handled locally to help with speed; Comet is a full-fledged program, so it’s not very limited on resources.

Postscript

The “chat” modality of LLMs doesn’t feel very natural due to the variety of questions and answers they can handle. These models are capable of teaching, writing, research, code generation and debugging, and companionship. Forcing all of that to go through an interface that looks just like Instagram DMs is extremely awkward and is absolutely an area ripe for improvement. To some extent, it already is being improved by Comet and a plethora of other startups selling AI wrappers. A truly mature AI/LLM product offering should be able to integrate right into video players, document editors, web browsers, IDEs and existing social media apps. You wouldn’t expect your coworker to type a whole research report or code review into Slack—we have Word and Git for that—so why should you lower your expectations for AI?