Vector databases are having a moment, as evidenced by the growing number of startups entering the space and the investors backing them. The proliferation of large language models (LLMs) and the generative AI (GenAI) movement have created fertile ground for vector database technologies.
Traditional vs. vector databases
While traditional relational databases like Postgres or MySQL are well-suited to structured data (predefined data types that can be neatly organized into rows and columns), they are less effective for unstructured data such as images, videos, emails, social media posts, and any other data that doesn't adhere to a predefined data model.
Vector databases, on the other hand, store and process data in the form of vector embeddings, which convert text, documents, images, and other data into numerical representations that capture the meaning of, and relationships between, individual data points. This is well suited to machine learning because the database stores data spatially, according to how closely each item relates to the others, making it easier to retrieve semantically similar data.
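To make "retrieving semantically similar data" concrete, here is a minimal sketch of a similarity lookup. It assumes cosine similarity as the distance measure and uses tiny hand-written three-dimensional vectors in place of real embeddings; the names `cosine_similarity`, `nearest`, and `TOY_INDEX` are hypothetical and not taken from any product mentioned here. Production vector databases typically use approximate indexes rather than the exhaustive scan shown.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, index, k=2):
    """Return the k item names whose vectors are most similar to the query."""
    ranked = sorted(
        index,
        key=lambda name: cosine_similarity(query, index[name]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embeddings, made up for illustration only.
TOY_INDEX = {
    "photo of a kitten": [0.9, 0.1, 0.0],
    "article about cats": [0.8, 0.2, 0.1],
    "quarterly sales report": [0.0, 0.1, 0.9],
}

# Stand-in for an embedded query such as "cat pictures".
query_vector = [0.85, 0.15, 0.05]
print(nearest(query_vector, TOY_INDEX))
```

The two cat-related items point in nearly the same direction as the query, so they rank ahead of the sales report even though no keywords are compared.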
Practical use of vector databases
Vector search is especially useful for LLMs such as OpenAI’s GPT-4, as it allows an AI chatbot to better understand the context of a conversation by analyzing previous, similar conversations. It is also useful for all kinds of real-time applications, such as content recommendations in social networks or e-commerce apps, because it can look at what a user has searched for and retrieve similar items in an instant.
Vector search can also help reduce "hallucinations" in LLM applications by providing additional information that may not have been available in the original training data set.
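One common way to supply that additional information is to retrieve the most relevant stored text and place it in the prompt ahead of the user's question. The sketch below illustrates the idea under loud assumptions: `embed` is a toy bag-of-words counter standing in for a real embedding model, and `build_prompt` and `DOCS` are hypothetical names invented for this example, not part of any vendor's API.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def build_prompt(question, documents, k=1):
    """Prepend the k documents most similar to the question as context."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: similarity(q, embed(d)), reverse=True)
    context = "\n".join(ranked[:k])
    return f"Context:\n{context}\n\nQuestion: {question}"

# Hypothetical knowledge base that the model was never trained on.
DOCS = [
    "the return window for online orders is 30 days",
    "our stores open at 9 am on weekdays",
]

print(build_prompt("what is the return window for orders", DOCS))
```

The resulting prompt carries the relevant passage along with the question, so the model can answer from supplied text rather than guessing.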
“You can still develop AI/ML applications without using vector similarity search, but you will have to do more pre-training and tuning,” Andre Zayarni, CEO and co-founder of startup Qdrant, explained to TechCrunch. “Vector databases come in handy when you have a large data set and need a tool to work efficiently and conveniently with vector embeddings.”
Qdrant raised $28 million in funding earlier this year to capitalize on the growth that made it one of the 10 fastest-growing commercial open source startups last year. And it’s certainly not the only vector database startup to raise money recently: Vespa, Weaviate, Pinecone, and Chroma collectively raised $200 million last year for their various vector offerings.
New players on the market
Since the beginning of the year, we’ve also seen Index Ventures lead a $9.5 million seed round in Superlinked, a platform that converts complex data into vector embeddings. A few weeks ago, Y Combinator (YC) unveiled its Winter 2024 cohort, which included Lantern, a startup that sells a hosted vector search engine for Postgres.
Marqo, meanwhile, raised a $4.4 million seed round late last year, which was quickly followed by a $12.5 million Series A in February. The Marqo platform provides a full range of vector tools out of the box, including vector generation, storage, and retrieval, all offered through a single API, so users can bypass third-party tools from companies like OpenAI or Hugging Face.
Marqo co-founders Tom Hamer and Jesse N. Clark previously worked in engineering roles at Amazon, where they recognized a “huge unmet need” for semantic, flexible search across modalities like text and images, which is why they decided to start Marqo in 2021.
“Working with visual search and robotics at Amazon was a moment where I really looked at vector search. I was thinking about new ways to do product discovery, and that very quickly narrowed down to vector search,” one of the co-founders said. “In robotics, I used multi-modal search to go through a large number of our images to identify if there were any random things like hoses and packages. That would have been very difficult to solve otherwise.”

Entering the business
Although vector databases are having their moment amidst the hype surrounding ChatGPT and the GenAI movement, they are not a panacea for every enterprise search situation.
“Specialized databases are typically fully focused on specific use cases and can therefore design their architecture for performance on the required tasks as well as user experience, compared to general-purpose databases that must incorporate it into the current design,” explained Peter Zaitsev, founder of database support and services company Percona.
While specialized databases may excel at one thing at the expense of others, we are starting to see database incumbents like Elastic, Redis, OpenSearch, Cassandra, Oracle, and MongoDB add vector search capabilities to the mix, as are cloud service providers like Microsoft's Azure, Amazon's AWS, and Cloudflare.
Zaitsev compares this latest trend to what happened with JSON more than a decade ago, when web apps became more common and developers needed a language-independent data format that was easy for humans to read and write. In that case, a new class of databases emerged in the form of document databases like MongoDB, while existing relational databases also introduced support for JSON.
“I think the same thing will probably happen with vector databases,” Zaitsev said. “Users who are building very complex and large AI applications will use specialized vector search databases, while people who need to build a piece of AI functionality for their existing application are probably more inclined to use vector search functionality in the databases they already use.”
But Zayarni and his colleagues at Qdrant are betting that native solutions built entirely around vectors will provide the “speed, memory safety, and scale” that will be needed as vector data explodes, compared to companies that add vector search as an afterthought.
“Their argument is, ‘If necessary, we can also do vector search,’” Zayarni said. “Our argument is, ‘We do advanced vector search in the best possible way.’ It’s about specialization. In fact, we recommend starting with whatever database you already have in your technology stack. At some point, users will run into limitations if vector search is a critical part of your solution.”
While vector databases are having their moment, it's important to remember that they're not a panacea for all applications. However, they are a tool that can be extremely useful in the right context, and their use is likely to continue to expand as AI technology advances.