K-Extractor™: See the scientific landscape at a glance
Know the top players in the virology and microbiology fields: what they are working on, who they are affiliated with, and who funds them.
Big Data: Volume, Variety & Velocity. Thousands of new papers are published every day; collecting this information and keeping it up to date by hand is no longer feasible.
Thousands of scientific research publications in PDF format, web pages with researcher biographies, and conference schedules.
Scientific profiles - creating in-depth "baseball cards" for each person with the following information:
Employment and affiliation history
Projects, co-workers and co-authors
Funding and awards
Topics of interest
Family and personal information
Each item on a profile has a direct link to the supporting documents from the input collections. Profiles are connected to each other with hypertext links.
Document structure recognition for PDFs and other complex file formats to detect the following elements:
– title, authors and affiliations
– headers and footers
– section titles and content
– citations and references
– tables and illustrations
Deep semantic processing of the text fragments to extract the concepts and semantic relations of interest.
Resolving aliases of people, organizations and other concepts: spelling variations, short forms, synonyms, etc. Merging concepts together enables the combination of knowledge from multiple sources.
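To make the alias-resolution step concrete, here is a minimal sketch in Python. It is not K-Extractor's actual method: it assumes a deliberately crude rule (strip accents, then reduce each person name to first initial plus last name), and the function names `alias_key` and `merge_aliases` are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def alias_key(name: str) -> str:
    """Reduce a person name to a crude key: first initial + last name.

    Assumption for illustration only; a production system would use
    far richer evidence (affiliations, co-authors, context).
    """
    # Strip accents (e.g. "José" -> "Jose") and lowercase.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
        .lower()
    )
    # Turn "Last, First" into "First Last".
    if "," in ascii_name:
        last, first = [part.strip() for part in ascii_name.split(",", 1)]
        ascii_name = f"{first} {last}"
    tokens = re.findall(r"[a-z]+", ascii_name)
    if not tokens:
        return ""
    return f"{tokens[0][0]} {tokens[-1]}"


def merge_aliases(mentions):
    """Group name mentions that share the same alias key."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[alias_key(mention)].append(mention)
    return dict(groups)


mentions = ["Jane A. Doe", "J. Doe", "Doe, Jane", "José García", "J Garcia"]
merged = merge_aliases(mentions)
# merged["j doe"] holds the three Doe variants; merged["j garcia"] the two García variants.
```

A key this coarse would also merge "John Doe" with "Jane Doe", which is why real alias resolution needs additional evidence before merging concepts.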
Extracted knowledge is saved into an RDF store that is available to the customer. Lastly, the RDF store is automatically queried to generate connected profiles for each person mentioned in the input collection.
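The final step, querying stored knowledge to build connected profiles, can be sketched as follows. This is an illustrative stand-in, not the product's implementation: a plain Python list of tuples plays the role of the RDF store (a real store would typically be queried with SPARQL), and every identifier, predicate name, and document name below is made up for the example.

```python
from collections import defaultdict

# Hypothetical extracted facts: (subject, predicate, object, source document).
# The fourth element records which input document supports the fact.
TRIPLES = [
    ("person:jane_doe", "affiliatedWith", "org:example_university", "doc:paper_001.pdf"),
    ("person:jane_doe", "coAuthorOf", "person:john_smith", "doc:paper_001.pdf"),
    ("person:jane_doe", "fundedBy", "org:example_foundation", "doc:bio_page.html"),
    ("person:john_smith", "affiliatedWith", "org:example_institute", "doc:paper_001.pdf"),
]


def build_profile(person, triples):
    """Collect every fact about one person, keeping its supporting document."""
    profile = defaultdict(list)
    for subject, predicate, obj, source in triples:
        if subject == person:
            profile[predicate].append({"value": obj, "source": source})
    return dict(profile)


def linked_people(person, triples):
    """Other people referenced from this person's facts (targets for profile links)."""
    return sorted(
        {obj for subject, _, obj, _ in triples
         if subject == person and obj.startswith("person:")}
    )


profile = build_profile("person:jane_doe", TRIPLES)
links = linked_people("person:jane_doe", TRIPLES)
```

Keeping the source document next to each fact is what allows every profile item to link back to its supporting document, and the person-to-person references are what connect one profile to another.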