Overview
pyJedAI is a Python framework that aims to offer both experts and novice users robust and fast solutions for multiple types of Entity Resolution problems. It is built on state-of-the-art Python frameworks. pyJedAI constitutes the sole open-source Link Discovery tool that is capable of exploiting the latest breakthroughs in Deep Learning and NLP techniques, which are publicly available through the Python data science ecosystem. This applies to both blocking and matching, thus ensuring high time efficiency, high scalability, and high effectiveness, without requiring any labelled instances from the user.
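To make the two stages named above concrete, here is a minimal pure-Python sketch of blocking followed by matching. It is an illustration of the general idea only and does not use or mirror the pyJedAI API: the function names (`token_blocking`, `candidate_pairs`, `match`), the toy records, and the 0.5 threshold are all assumptions chosen for this example.

```python
from collections import defaultdict
from itertools import combinations


def tokens(record):
    """Lower-cased whitespace tokens from all attribute values of a record."""
    return {tok for value in record.values() for tok in str(value).lower().split()}


def token_blocking(records):
    """Blocking: group record ids by shared token, keeping blocks with 2+ ids."""
    blocks = defaultdict(set)
    for rid, record in records.items():
        for tok in tokens(record):
            blocks[tok].add(rid)
    return {tok: ids for tok, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Distinct id pairs that co-occur in at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match(records, threshold=0.5):
    """Matching: keep candidate pairs whose similarity reaches the threshold."""
    token_sets = {rid: tokens(rec) for rid, rec in records.items()}
    pairs = candidate_pairs(token_blocking(records))
    return sorted(
        p for p in pairs
        if jaccard(token_sets[p[0]], token_sets[p[1]]) >= threshold
    )


# Toy records with different attribute names (hypothetical data).
records = {
    1: {"name": "Apple iPhone 13", "color": "blue"},
    2: {"title": "iphone 13 apple blue 128GB"},
    3: {"name": "Samsung Galaxy S21"},
}
print(match(records))  # [(1, 2)]
```

Blocking keeps record 3 from ever being compared, because it shares no token with the others; this is where the time savings come from on larger inputs. No labelled pairs are needed, since the decision rests on a similarity threshold alone.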
Key Features
Input data-type independent. Both structured and semi-structured data can be processed.
Various implemented algorithms.
Easy to use.
Builds on well-known, cutting-edge machine learning packages.
Offers supervised and unsupervised ML techniques.
Install
Warning
We are currently experiencing issues with PyPI installation due to a problem with the py-stringmatching dependency, and we are actively working on resolving it. As a temporary workaround, first run:
pip install 'numpy>=1.7.0,<2.0' and then pip install pyjedai.
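To confirm that the workaround took effect, the installed NumPy version can be compared against the pin. The following is a minimal sketch using only the standard library; `within_numpy_pin` and `installed_numpy_version` are hypothetical helper names, and the parser deliberately looks only at the leading major.minor numbers.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def within_numpy_pin(version_string):
    """True if version_string satisfies >=1.7.0,<2.0 (major.minor only)."""
    found = re.match(r"(\d+)\.(\d+)", version_string)
    if found is None:
        return False
    major, minor = int(found.group(1)), int(found.group(2))
    return (1, 7) <= (major, minor) < (2, 0)


def installed_numpy_version():
    """Installed NumPy version string, or None if NumPy is not installed."""
    try:
        return version("numpy")
    except PackageNotFoundError:
        return None


current = installed_numpy_version()
if current is None:
    print("NumPy is not installed")
else:
    print(f"NumPy {current} within pin: {within_numpy_pin(current)}")
```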
pyJedAI has been tested on Windows and Linux.
Basic requirements
Python 3.8 or later.
For Windows, Microsoft Visual C++ 14.0 is required. Download it from the official Microsoft site.
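The Python requirement can be checked from the interpreter before installing. A small sketch under the assumption that only the major and minor version matter; `meets_python_requirement` and `is_installed` are hypothetical helper names, not part of pyJedAI.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the requirements above


def meets_python_requirement(version_info=sys.version_info, minimum=MIN_PYTHON):
    """True if the (major, minor) part of version_info is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def is_installed(package_name):
    """True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(package_name) is not None


print("Python OK:", meets_python_requirement())
print("pyjedai installed:", is_installed("pyjedai"))
```

Running it again after installation should report `pyjedai installed: True` in the same environment.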
PyPI
Install the latest version of pyjedai:
pip install pyjedai
More on PyPI.
Git
Set up locally:
git clone https://github.com/AI-team-UoA/pyJedAI.git
Go to the root directory with cd pyJedAI and run:
pip install .
Docker
Available at Docker Hub, or clone this repo and run the following from its root directory:
docker build -f Dockerfile .
Tutorials
Find all the code for the tutorials in the pyjedai/docs/tutorials folder.
Dependencies
See the full list of dependencies and the versions used in this file.
Bugs, Discussions & News
GitHub Discussions is the discussion forum for general questions and discussions and our recommended starting point. Please report any bugs that you find here.
Java - Web Application
Java users can check out the original JedAI, which provides Java-based code and a web application for interactively creating ER workflows.
JedAI constitutes an open source, high scalability toolkit that offers out-of-the-box solutions for any data integration task, e.g., Record Linkage, Entity Resolution and Link Discovery. At its core lies a set of domain-independent, state-of-the-art techniques that apply to both RDF and relational data.
Team & Authors
Research Associate at University of Athens
Research Associate at University of Athens
Research Associate at University of Athens
Postdoctoral Researcher at University of Athens, Entity Resolution expert
Assistant Professor at Tilburg University, Product Matching expert
Professor at University of Athens
Research and development is carried out under the supervision of Prof. Manolis Koubarakis. This is a research project by the AI-Team of the Department of Informatics and Telecommunications at the University of Athens.
License
Don't forget to cite us!
If you find this work useful, please cite us using the following reference:
@inproceedings{pyJedAI,
author = {Nikoletos, Konstantinos and Papadakis, George and Koubarakis, Manolis},
booktitle = {Demo at International Semantic Web Conference.},
series = {ISWC},
title = {{pyJedAI: a lightsaber for Link Discovery}},
year = {2022}
}
Thank you!
Released under the Apache-2.0 license (see LICENSE.txt).