deepset

deepset is an enterprise software vendor that provides developers with the tools to build production-ready natural language processing (NLP) systems. It was founded in 2018 in Berlin by Milos Rusic, Malte Pietsch, and Timo Möller.[1] deepset authored and maintains the open source software Haystack[2] and its commercial SaaS offering deepset Cloud.[3]

deepset
Company type: Private
Industry: Natural Language Processing
Founded: June 22, 2018
Founders:
  • Milos Rusic
  • Malte Pietsch
  • Timo Möller
Headquarters: Berlin, Germany
Products: Haystack, deepset Cloud
Number of employees: > 50
Website: www.deepset.ai

History

In June 2018, Milos Rusic, Malte Pietsch, and Timo Möller co-founded deepset in Berlin, Germany.[1] In the same year, the company served its first customers, who wanted to implement NLP services by tailoring BERT language models to their domains.

In July 2019, the company released the initial version of the open source software FARM.[4]

In November 2019, the company released the initial version of the open source software Haystack.[2]

Throughout 2020 and 2021, deepset published several applied research papers at EMNLP, COLING, and ACL, leading conferences in the area of NLP. In 2020, the research contributions comprised the German language models GBERT and GELECTRA,[5] and COVID-QA, a question answering dataset addressing the COVID-19 pandemic that was created in collaboration with Intel and annotated by biomedical experts.[6]

In 2021, the research contributions comprised German models and datasets for question answering and passage retrieval named GermanQuAD and GermanDPR,[7] a semantic answer similarity metric,[8] and an approach for multimodal retrieval of texts and tables to enable question answering on tabular data.[9] Haystack contains implementations of all three contributions, enabling the use of the research through the open source framework.

In November 2021, the development of the FARM framework was discontinued and its main features were integrated into the Haystack framework.[4]

In April 2022, the company announced its commercial SaaS offering deepset Cloud.[3]

As of August 2023, the most popular finetuned language model created by deepset had been downloaded more than 52 million times.[10]

Products and applications

Haystack is an open source Python framework for building custom applications with large language models. With its modular building blocks, software developers can implement pipelines that address various search tasks over large document collections, such as document retrieval, semantic search, text generation, question answering, or summarization. It integrates with Hugging Face Transformers, Elasticsearch, OpenSearch, OpenAI, Cohere, Anthropic, and others. The framework has an active community on Discord, with more than 1.8k members, and on GitHub, where more than 200 people have contributed to its development;[11] it also has a community on Meetup.[12] Thousands of organizations use the framework, including Global 500 enterprises such as Airbus, Intel, Netflix, Apple, and Infineon, as well as Alcatel-Lucent Enterprise, BetterUp, Etalab, Sooth.ai, and Lego.[13][14]
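
The idea of composing a pipeline from modular building blocks can be illustrated with a minimal sketch in plain Python. It deliberately does not use Haystack's own API: the names `Document`, `Pipeline`, `add_step`, `make_keyword_retriever`, and `first_sentence_reader` are hypothetical and chosen for illustration only, and the keyword-overlap ranking is a stand-in for a real retriever.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Document:
    content: str


# A step receives the pipeline state and returns an updated copy of it.
Step = Callable[[Dict], Dict]


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def make_keyword_retriever(documents: List[Document], top_k: int = 2) -> Step:
    """Build a step that ranks documents by word overlap with the query."""
    def retrieve(state: Dict) -> Dict:
        query_terms = tokenize(state["query"])
        ranked = sorted(
            documents,
            key=lambda doc: len(query_terms & tokenize(doc.content)),
            reverse=True,
        )
        return {**state, "documents": ranked[:top_k]}
    return retrieve


def first_sentence_reader(state: Dict) -> Dict:
    """Toy reader: use the first sentence of the best document as the answer."""
    docs = state["documents"]
    answer = docs[0].content.split(".")[0].strip() if docs else None
    return {**state, "answer": answer}


class Pipeline:
    """Runs named steps in the order in which they were added."""

    def __init__(self) -> None:
        self.steps: List[Tuple[str, Step]] = []

    def add_step(self, name: str, step: Step) -> "Pipeline":
        self.steps.append((name, step))
        return self

    def run(self, query: str) -> Dict:
        state: Dict = {"query": query}
        for _name, step in self.steps:
            state = step(state)
        return state


if __name__ == "__main__":
    docs = [
        Document("Haystack is an open source Python framework."),
        Document("FARM was a framework for adapting representation models."),
        Document("deepset was founded in 2018 in Berlin."),
    ]
    pipeline = (
        Pipeline()
        .add_step("retriever", make_keyword_retriever(docs, top_k=1))
        .add_step("reader", first_sentence_reader)
    )
    print(pipeline.run("Where was deepset founded?")["answer"])
```

Because each step only reads and extends a shared state, a retriever or reader can be swapped for another implementation without changing the rest of the pipeline, which is the property the modular design is meant to provide.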

The deepset Cloud platform supports customers in building scalable NLP applications by covering the entire process of prototyping, experimentation, deployment, and monitoring.[15] It is built on Haystack.

FARM was a framework for adapting representation models.[4] One of its core concepts was the adaptive model, which combined a language model with an arbitrary number of prediction heads. FARM supported domain adaptation and finetuning of these models with advanced options such as gradient accumulation, cross-validation, and automatic mixed-precision training. Its main features were integrated into Haystack in November 2021, and its development was discontinued at that time.[16]
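
Of the training options named above, gradient accumulation is the easiest to show in isolation. The following is a minimal sketch in plain Python, not FARM's implementation: it fits a one-parameter model `w * x` with a mean squared error loss, and the function names and the toy data are assumptions made for illustration. It shows that summing gradients over small micro-batches and applying a single update gives the same result as one update on the full batch.

```python
from typing import List, Tuple

Example = Tuple[float, float]  # (input x, target y)


def gradient_sum(w: float, batch: List[Example]) -> float:
    """Sum of per-example gradients of (w * x - y) ** 2 with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch)


def full_batch_step(w: float, data: List[Example], lr: float) -> float:
    """One gradient descent update using the mean gradient of the whole batch."""
    return w - lr * gradient_sum(w, data) / len(data)


def accumulated_step(
    w: float, data: List[Example], lr: float, micro_batch_size: int
) -> float:
    """Same update, but the gradient is accumulated over micro-batches.

    Only one micro-batch is processed at a time; the weight is updated once,
    after all micro-batches have contributed to the accumulated gradient.
    """
    accumulated = 0.0
    for start in range(0, len(data), micro_batch_size):
        micro_batch = data[start:start + micro_batch_size]
        accumulated += gradient_sum(w, micro_batch)
    return w - lr * accumulated / len(data)


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
    print(full_batch_step(0.5, data, lr=0.01))
    print(accumulated_step(0.5, data, lr=0.01, micro_batch_size=2))
```

In practice the technique is used when a full batch does not fit into accelerator memory: the effective batch size stays the same while peak memory is bounded by the micro-batch size.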

Funding

On August 9, 2023, deepset announced a Series B investment round of $30 million led by Balderton Capital, with participation from existing investors GV, System.One, Lunar Ventures, and Harpoon Ventures.[17][18][19][20]

On April 28, 2022, deepset announced a Series A investment round of $14 million led by GV, with the participation of Harpoon Ventures, Acequia Capital, and a group of experienced commercial open source software and machine learning founders, such as Alex Ratner (Snorkel AI), Mustafa Suleyman (DeepMind), Spencer Kimball (Cockroach Labs), Jeff Hammerbacher (Cloudera), and Emil Eifrem (Neo4j).[1]

An earlier pre-seed investment round of $1.6 million on March 8, 2021, was led by System.One and Lunar Ventures, who also participated in the subsequent Series A round.

References

  1. Wiggers, Kyle (April 28, 2022). "Deepset raises $14M to help companies build NLP apps". TechCrunch. Retrieved August 31, 2022.
  2. "deepset-ai/haystack". GitHub. Retrieved August 31, 2022.
  3. "deepset Cloud". deepset. Retrieved August 31, 2022.
  4. "deepset-ai/FARM". GitHub. Retrieved August 31, 2022.
  5. Chan, Branden; Schweter, Stefan; Möller, Timo (2020). "German's Next Language Model". Proceedings of the 28th International Conference on Computational Linguistics. Barcelona, Spain (Online): International Committee on Computational Linguistics. pp. 6788–6796. doi:10.18653/v1/2020.coling-main.598.
  6. Möller, Timo; Reina, Anthony; Jayakumar, Raghavan; Pietsch, Malte (2020-07-09). "COVID-QA: A Question Answering Dataset for COVID-19". Proceedings of the 1st Workshop on NLP for COVID-19 at ACL 2020. Online: Association for Computational Linguistics.
  7. Möller, Timo; Risch, Julian; Pietsch, Malte (2021). "GermanQuAD and GermanDPR: Improving Non-English Question Answering and Passage Retrieval". Proceedings of the 3rd Workshop on Machine Reading for Question Answering. Punta Cana, Dominican Republic: Association for Computational Linguistics: 42–50. arXiv:2104.12741. doi:10.18653/v1/2021.mrqa-1.4.
  8. Risch, Julian; Möller, Timo; Gutsch, Julian; Pietsch, Malte (2021). "Semantic Answer Similarity for Evaluating Question Answering Models". Proceedings of the 3rd Workshop on Machine Reading for Question Answering. Punta Cana, Dominican Republic: Association for Computational Linguistics: 149–157. arXiv:2108.06130. doi:10.18653/v1/2021.mrqa-1.15.
  9. Kostić, Bogdan; Risch, Julian; Möller, Timo (2021). "Multi-modal Retrieval of Tables and Texts Using Tri-encoder Models". Proceedings of the 3rd Workshop on Machine Reading for Question Answering. Punta Cana, Dominican Republic: Association for Computational Linguistics: 82–91. arXiv:2108.04049. doi:10.18653/v1/2021.mrqa-1.8.
  10. "deepset/roberta-base-squad2 · Hugging Face". huggingface.co. Retrieved October 12, 2022.
  11. "Contributors to deepset-ai/haystack". GitHub. Retrieved August 31, 2022.
  12. "Open NLP Group". Meetup. Retrieved August 31, 2022.
  13. Laughlin, Eleni (April 28, 2022). "deepset Raises $14 Million Series A Led By GV for Advanced NLP Platform". Business Wire. Retrieved August 31, 2022.
  14. "Who uses Haystack". GitHub. Retrieved August 31, 2022.
  15. "deepset Cloud". VentureBeat. 28 April 2022. Retrieved November 1, 2022.
  16. Zhou, Jiayuan; Pacheco, Michael; Wan, Zhiyuan; Xia, Xin; Lo, David; Wang, Yuan; Hassan, Ahmed E. (2021). "Finding A Needle in a Haystack: Automated Mining of Silent Vulnerability Fixes". 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). pp. 705–716. doi:10.1109/ase51524.2021.9678720. ISBN 978-1-6654-0337-5. S2CID 246081539. Retrieved 2023-11-13.
  17. "Deepset raises $30M to help enterprises unlock the value of LLMs". VentureBeat. 9 August 2023. Retrieved August 22, 2023.
  18. "Deepset secures $30M to expand its LLM-focused MLOps offerings". TechCrunch. 9 August 2023. Retrieved August 22, 2023.
  19. "Deepset, an AI startup that helps companies build apps with LLMs, just raised $30 million with this 12-slide pitch deck". Business Insider. Retrieved August 22, 2023.
  20. "Deepset raises $30 million to help the world's biggest companies leverage LLM promise". Balderton. 9 August 2023. Retrieved August 22, 2023.

External links

  • Official website
  • Deepset-ai on GitHub
