The Database Corner

Monday @ 2pm

ABOUT ME

Marcos Bedo

Marcos Bedo is an Associate Professor in the Institute of Computing at Fluminense Federal University (IC/UFF). Marcos received his B.Sc. (2011), M.Sc. (2013), and Ph.D. (2017) from the University of São Paulo (USP). He is currently involved in research projects focused on content-based medical image retrieval, information retrieval, data-driven analytics for flood modeling and simulation, and provenance management. His research interests include database design, similarity searching, medical imaging, and data engineering.

Similarity searching

Similarity search is a general paradigm covering a variety of processes that share one principle: querying (very large) spaces of objects in which the only available comparator is the similarity between any pair of objects.

In the age of big data, similarity search can provide efficient mechanisms for querying large information repositories whose objects have no natural order, e.g., images and other complex data. It can handle different data types and formats as long as they are coupled with a meaningful metric distance function.

Although the k-Nearest Neighbors (kNN) query is the best-known type of similarity search, other query types, such as top-k, skyline, and result diversification, often appear in the context of data retrieval, data analysis, and data mining. Typically, similarity searching methods are based on the topological framework of metric spaces, which allows the design of efficient and flexible indexing structures and searching algorithms.
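
To make the kNN query concrete, here is a minimal sketch in Python. It assumes objects are plain numeric tuples compared with the Euclidean distance, and it answers the query with a linear scan. The names `euclidean` and `knn_query` are illustrative only and do not come from any particular library.

```python
import heapq
import math
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean (L2) distance, one example of a metric distance function."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_query(
    data: Sequence[Point],
    query: Point,
    k: int,
    dist: Callable[[Point, Point], float] = euclidean,
) -> List[Tuple[float, Point]]:
    """Return the k objects closest to `query` as (distance, object) pairs.

    This is a linear scan: every object is compared against the query.
    """
    scored = ((dist(obj, query), obj) for obj in data)
    return heapq.nsmallest(k, scored, key=lambda pair: pair[0])


# Hypothetical 2-D feature vectors standing in for complex objects.
features = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
nearest = knn_query(features, query=(0.0, 0.0), k=2)
```

The scan touches every object, which is exactly the cost that metric indexing structures try to avoid: because a metric obeys the triangle inequality, an index can discard groups of objects without computing their distance to the query.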

Research in similarity search is dominated by the inherent problems of searching large repositories that contain complex data formats. These problems range from data embedding to distributed and parallel algorithms. Many challenges remain unsolved (such as handling distance concentration due to the curse of dimensionality) or are in constant evolution (such as adding seamless support for similarity searches to Data Warehouses and Data Lakes).
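
Distance concentration can be observed with a small experiment. The sketch below assumes points drawn uniformly at random from the unit hypercube and measures the relative contrast, (farthest - nearest) / nearest, between a query and a sample of points. The function name `relative_contrast` and its parameters are illustrative choices, not part of any library.

```python
import math
import random


def relative_contrast(dim: int, n_points: int = 200, seed: int = 0) -> float:
    """Relative gap between the farthest and nearest neighbor of a random query.

    Points and query are sampled uniformly from [0, 1]^dim (an assumption
    made only for this illustration).
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


low_dim = relative_contrast(dim=2)
high_dim = relative_contrast(dim=1000)
```

Under these assumptions the contrast in 1000 dimensions is far smaller than in 2 dimensions: the nearest and farthest objects end up almost equally distant from the query, so distance-based pruning loses much of its power.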

If you are a student with basic programming and math skills and want to learn more about this topic, you can contact me!

Location

Institute of Computing – IC/UFF
Office 531 – IC Building
Av. Gal. Milton Tavares de Souza, s/nº
São Domingos – Niterói – RJ
CEP: 24210-346
