- Improved the error messages associated with common error conditions generated through the REST API to make it easier for developers to identify the issue causing the request to fail
- Added a REST API endpoint to allow users to remove all global identifiers that have been generated in the repository
- Added the ability to filter the list of update notifications by end event date, source event type, and transition type
- Added a REST API endpoint for updating records that is more consistent with REST API standards
- Standardized the formatting of creation dates for identifier domains
- Enhanced the validation of custom field resources created through the REST API to ensure configuration parameters for certain transformation functions are correct
- Fixed the update operation of a role to ensure that the permissions list is updated correctly through the REST API
- Added an option to the HL7v3 PDQ Supplier to allow the user to collapse multiple records that have been linked together into a single subject with multiple identifiers
- Made changes to the software installer so that it works on Windows machines
- Fixed the parsing of dates into strings to avoid concurrency related errors under very heavy workloads
- Fixed the update user account operation to prevent an issue that was causing it to fail under certain conditions
- The findByIdentifiers interface was not always returning the specified number of records through the paging parameters in the case where the query results included identifiers marked for deletion. The interface was fixed to always return the correct number of records based on the paging parameters.
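
The findByIdentifiers paging fix above is a matter of ordering. The release notes do not describe the cause, but a common source of short pages is applying the page window before discarding rows marked for deletion. The sketch below illustrates that assumption only; the `Identifier` type, its `deleted` flag, and both function names are hypothetical and not part of the OpenEMPI API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Identifier:
    value: str
    deleted: bool = False  # hypothetical "marked for deletion" flag


def page_identifiers(rows: List[Identifier], first_result: int, max_results: int) -> List[Identifier]:
    """Filter out deleted identifiers first, then apply the page window."""
    active = [row for row in rows if not row.deleted]
    return active[first_result:first_result + max_results]


def page_then_filter(rows: List[Identifier], first_result: int, max_results: int) -> List[Identifier]:
    """The problematic ordering: page first, filter afterwards (page may come back short)."""
    window = rows[first_result:first_result + max_results]
    return [row for row in window if not row.deleted]
```

With four rows of which the second is deleted, asking for a page of two returns two active identifiers from `page_identifiers` but only one from `page_then_filter`.
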
- Added support to the probabilistic algorithm to allow manually specified matching rules for null-scored patterns to take precedence over the rule for the same vector pattern without null scoring applied
- Added a new REST interface to allow the caller to request that the matching algorithm reevaluates a record in its association with other records without requiring that the caller updates the record
- Added support to the probabilistic algorithm for specifying manual matching rules for null-scored vector patterns
- Added some minor improvements to log file management in the embedded instance of the Tomcat server
- Updated the PIXm and PDQm implementations to reflect the latest versions of the IHE specifications
- Updated the PIX and PDQ HL7v3 implementations to bring them into compliance with the latest specification
- The PIXv3 query response does not include the assigningAuthorityName attribute when present
- The PDQv3 interface doesn't support the dateOfBirth attribute when the datatype for the date of birth is a string (new default in version 3.6.x)
- Invalid audit message format according to the latest IHE/DICOM Specification
- The PIXv3 query result did not comply with the latest IHE specification: the statusCode element required by "QueryAck SHALL have a statusCode element" was missing
- The ATNA audit messages are not consistent with the latest IHE specifications
- Deleting an entity fails if it already has entity groups and job queue entries associated with the new entity
- Validate a new entity to ensure that it includes the required fields and at least one attribute
- The REST authentication filter was not sending an appropriate error message when the session ID is blank
- In the probabilistic algorithm with a manual vector configuration and debug enabled, there is an exception
- The update operation on the User Files resource was resetting the dateCreated field
- The REST call to get user files should expose the name field of the File object
- Importing a file through the Record resource should validate that either userFileId or filename is present in the request (either reference a previously uploaded file or upload one)
- Manual matching override rules for null-scored patterns should take precedence over regular rules
- Add support for creating and dropping indexes through the REST API
- Replace deprecated Hibernate API with newer version
- Add the entityVersionId attribute to the blocking configuration object in addition to the legacy entityName attribute.
- Extend the CustomFields resource to support updates of Custom Fields rather than expect the user to delete the entry and recreate it.
- Add the ability to initiate a long-running operation that exports the records of an entity via the REST API
- Updating an entity through the REST API should allow you to update attributes as well without having to use the entity attribute resource
- Add a REST resource to start and stop the PIX/PDQ service
- Need to add an operation on the Security resource that authenticates and returns the User resource associated with the authenticated user
- Migrated the persistence layer for the graph database to use the latest version of OrientDB 3.0.x
- Performed load testing of OpenEMPI with the new OrientDB 3.0 persistence layer; results will be posted on the OpenEMPI web site
- Upgraded the implementation of the REST API to the latest edition of Jersey. This introduced some incompatibilities with the OpenEMPI REST API in 3.5.x but unfortunately this is unavoidable
- Exposed the audit logs as a REST resource through the REST API
- Added to the audit service and persistence layer support for auditing by entity type. The events are now always returned for a specific entity
- Added the ability to specify that a scheduled task should perform its work against a specific, configurable entity
- Exposed the Blocking Configuration through the REST API as a resource
- Added default blocking and matching configurations when a new entity is created
- Replaced the old cache library with a new one that is lighter
- Added a REST API to allow a user to reevaluate a record's association to other records without the need for an update operation
- Exposed the deterministic and probabilistic matching configurations as REST resources through the REST API
- Exposed the User object as a REST Resource through the REST API
- Enhanced the installation processes to include support for Apache HTTP installation
- Exposed the Logged Links service as a REST resource through the REST API
- Exposed the data profile service as a REST Resource through the REST API
- Make sure all long-running operations invoked through the REST API create jobs and run asynchronously
- Exposed the Job Queue service as a REST resource through the REST API
- Added support for manual classification rules of vector patterns that correspond to null scored patterns. This is now available through the mpi-config.xml file
- Added to the service layer the ability to retrieve a record along with all its inactive (voided) identifiers
- Upgraded the web services test suite to Jersey 2
- Fixed an issue where creating a custom field on a new instance caused an exception in certain cases
- Fixed an issue where a concurrent modification exception was generated while creating logged links during load testing
- Fixed an issue where the blocking algorithm was reporting that it could not load records from blocks when those records had been deleted. Since the records had been marked as deleted the blocking algorithm was not able to generate index entries for them
- Fixed an issue with the blocking service not making use of the consumer queue wait time parameter
- Replaced usage of Hibernate's legacy Criteria API, which has been deprecated
- Upgraded older versions of library dependencies that had been flagged with security concerns
- Fixed an issue where under certain conditions when a probable link is updated to a match state, an update notification would not be generated
- Added support to the REST API for being able to specify what type of record links should be returned in association with a specific record (request match, probable, or both)
- Added support to the REST API for being able to specify what type of record links should be returned from the record link resource (request match, probable, or both)
- Applied a number of changes to improve the performance of the REST API
- Fixed the issue where in certain cases uploading an entity definition was not creating a user file entry of entity type, so the file entry was not showing up as an entity definition to be imported
- Fixed an issue with the PDQ (HL7v2 binding) interface not returning the dateOfBirth field after the default data type for the date of birth field was changed from a date to a string
- Fixed an issue with the PDQ (HL7v2 binding) interface not returning the phone number field in the PID segment when a phone type field has not been populated
- Fixed an issue with the REST API where deleting a record by id was not working under a more recent version of the graph database
- Added support to the service layer for retrieving a record along with all its inactive (voided) identifiers
- Improved the performance of the REST API for retrieving a specific link by its two endpoints by making it possible for the optimizer to use existing indexes
- Added support for logging to the REST APIs so that it can optionally be turned on at the server to help debug issues with calls to the interface
- Added support to the String Comparator Resource of the REST API for parameters to allow users to evaluate the similarity between strings for similarity metrics that require parameters to be passed down in the request.
- Enhanced the Apache Artemis implementation of the notification service to support the propagation of identifier update notifications and use the latest stable version of the messaging service
- Improved the performance of the process that generates candidate record pairs to be evaluated by the matching algorithm to reduce the number of record pairs that are generated. This modification can cause a considerable performance improvement to the processing of add and update requests
- Modified the Record Link resource to return the internal record id for each link to make it easier and more efficient to retrieve detailed information about each link.
- Fixed the request for logged links to filter links returned by the entity specified in the request.
- Fixed a bug in the processing of the findByMatching request where certain characters in the key-value pairs passed as parameters were causing an exception.
- Fixed a bug with the asynchronous persistence of logged links to resolve a concurrent modification exception that would arise.
- Enhanced the performance of the File Import service so that loading millions of records no longer requires a proportional amount of heap memory.
- Enhanced the processing of background jobs so that only one job is processed at a time and jobs are processed in the order in which they are created. This will prevent cases where multiple data-intensive jobs were being processed concurrently and bringing the server to its knees.
- Added a new REST Resource to expose the string comparator service. This allows users to test various similarity metrics and thresholds so that they can identify the ideal parameters for their instance.
- Added a new REST interface to support loading data from a file without requiring that the records are imported but with matching activated
- Fixed the generation of blocks to skip generating blocks for a blockingKeyValue that is all null.
- Disable the "Reevaluate" task on the Matching page since it was causing confusion among some users
- The import schema utility was not properly handling the synchronous vs asynchronous parameter in the serialized schema that was being imported
- Improved the handling of background jobs so that if the server stops before a job in the queue is done, any job that is left in the "Processing" state will be rescheduled upon startup
- String Comparison REST Resource should return No Data found HTTP code when there are no parameters for a given metric instead of an empty list
- Uploading a file while the REST interface is actively in use by a different user than the user that is currently logged into the UI would on occasion set the owner of the file to be the API user due to a race condition
- Increased the default maximum connection pool size for the relational database so that for most deployments the users don't have to manually fine tune it
- Added an attribute to the file loader mapping file that allows the user to specify the number of columns in the file in the Flexible File Loader. This is useful when the data file has lots of optional fields especially in the last columns of the file
- Upgraded the release to embed Apache Tomcat 8.5.x
- Saving records with invalid date values into a database field of date or timestamp type was being rejected by OrientDB causing the record to fail to be imported; the file loader will now detect and clear such values to allow the records to be imported into the database
- Added support for user authentication for OpenEMPI through LDAP instead of the default mechanism. You can read more about this feature and how to configure it here.
- Added support for synonyms in matching, which are lists of two or more words that should be considered by the matching algorithm to be identical (Robert and Bob). You can read more about this feature and how to configure it here.
- Modify the importFile RESTful web service endpoint to expose more file import features and to make it asynchronous
- Modify the persistence of logged links during classification by the probabilistic matching algorithm so that they commit for each batch instead of for the entire operation, making the operation more efficient and scalable for sites with 10s of millions of records
- Added support for communication with the LDAP server using StartTLS
- Added support for a new feature in the configuration of the probabilistic matching algorithm that memorizes manually classified probable links so that such record pairs don't return to the review queue in the future. You can read more about this feature here.
- Add RESTful web service resource to manage synonyms
- Fixed a bug where overriding the probabilistic matching algorithm so that record pairs with a specific vector pattern are not matched was causing record pairs with that pattern to generate an exception
- Add a new feature to the matching layer that allows the system to match two records in the case where the values for two fields have been transposed. For example, it is fairly common for the values of the first and last name to be transposed for a given record making it difficult to match the two records together
- Add a configuration integration framework to OpenEMPI that now allows for data to flow into and out of OpenEMPI using a wide variety of data sources and sinks. For example, you can set up an instance of OpenEMPI to periodically load data from a database by issuing a query to retrieve the records
- Added a new similarity metric that calculates the similarity between two numeric values such that numeric values that are closer together get a higher value
- Added a new similarity metric that calculates the similarity between two dates such that dates that are closer together get a higher value
- Modified the administrative application to immediately incorporate changes to custom fields so that there is no need to restart the service in order to proceed with subsequent configuration steps
- Added a new web services endpoint that returns records that are similar to the record presented by the caller along with a weight indicating the relative similarity between the two records
- Added the ability for a site to include a site-specific disclaimer message in the web application before the user is permitted to login
- Fixed an issue where the flexible file loader would load records without setting a date for the date created field
- Fixed the export process so that it is able to export records with identifiers that have no date created value
- Fixed the review links page to allow for sorting by weight or date created to assist the users in locating the specific record pairs to be resolved first
- Fixed the logging of the probabilistic algorithm for record links during the evaluation process to reduce the log level
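
The numeric and date similarity metrics listed above score two values higher the closer together they are. The release notes do not give the formula, so what follows is only a sketch of one simple possibility: a linear falloff that reaches zero at an assumed maximum distance. The function names and the `max_distance` and `max_days` parameters are hypothetical.

```python
from datetime import date


def numeric_similarity(a: float, b: float, max_distance: float) -> float:
    """Return 1.0 for equal values, falling linearly to 0.0 at max_distance apart."""
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    distance = abs(a - b)
    return max(0.0, 1.0 - distance / max_distance)


def date_similarity(a: date, b: date, max_days: int) -> float:
    """Apply the same linear falloff to the number of days between two dates."""
    return numeric_similarity(a.toordinal(), b.toordinal(), max_days)
```

A score in the range 0 to 1 can be compared against a threshold in the same way as a string similarity score, which is the main reason for normalizing the distance here.
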
- Added support for stronger password encryption in the professional edition of OpenEMPI that utilizes the latest encoding algorithms
- The instance can now be configured to hide encoded passwords from log files to allow for easier sharing of those files
- A new service is now included that can be configured to periodically delete old records from the audit log.
- Upgraded the connection pool for the relational database to the Hikari pool that provides much better performance under heavy loads
- Made it easier to configure the sampling rate of record pairs that are used during the training phase of the probabilistic matching algorithm
- Fixed an issue where exceptions are generated during shutdown when the instance has not been configured properly
- Added support for the identifier domain transformer, which can now be used in two different transformation modes
- Added a new report that provides a summary of all the records that have been classified as a match during a period of time
- Added full support for 2-way TLS encryption to the HL7 v2 service interface
- Upgraded the embedded graph database OrientDB to the latest stable release
- Developed better isolation between the notification service and the specific implementation to make it easier to support other messaging brokers in the future
- Developed better integration between the PIX/PDQ service and the rest of the application to reduce the number of configuration files
- Utilized a new interface in the embedded OrientDB database for smoothly shutting down the database
- Made some of the operations that process record links configurable to better support sites with 10s of millions of links
- Added support for both JSON and XML messages to the person-based REST API that didn't previously support JSON
- Fixed the blocking service to load the latest configuration without requiring a server restart after configuration changes
- Fixed a few minor issues with the single best record module
- Added support for transitive closure of record pairs to resolve the issue with conflicting links being created with complex matching rule configurations
- Added support for the findOrAdd service method to the RESTful web services interface which adds a record only if a matching one is not found in the system
- Added a configuration parameter to support sites with tens of millions of links which need to be able to configure a larger block size when assigning global identifiers
- Added a new transformation function that extracts a value using a regular expression when forming a custom field
- Added support for the findByMatching and findByBlocking service methods through the RESTful web services interface
- Added support for custom fields in the findByMatching and findByBlocking service methods
- Fixed the deterministic matching algorithm to allow it to use distance metric parameters where present for certain metrics
- Fixed the support of JSON as the input and output data format for some of the Person REST interface methods that didn't support it
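
The regular-expression transformation function mentioned above derives a custom field by extracting part of a source field. A rough illustration of the idea using Python's `re` module follows; the function name, the choice to return the first capture group when one is defined, and the sample patterns are assumptions and do not reflect OpenEMPI's configuration syntax.

```python
import re
from typing import Optional


def extract_with_regex(value: Optional[str], pattern: str) -> Optional[str]:
    """Return the part of value matched by pattern, or None when there is no match."""
    if value is None:
        return None
    match = re.search(pattern, value)
    if match is None:
        return None
    # Prefer the first capture group when the pattern defines one,
    # otherwise fall back to the whole match.
    return match.group(1) if match.groups() else match.group(0)
```

For example, `extract_with_regex("555-123-4567", r"^(\d{3})-")` yields the leading three digits, which could then serve as a blocking or matching field.
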
- Added support for auditing all events (including events to view a record) for HIPAA compliance
- Added an interface to the Entity REST API to support paging through all the records in an instance
- Added new transformation functions that can be used in the generation of custom fields
- Added support for SQLServer as the relational database supporting OpenEMPI
- Added support for MySQL as the relational database supporting OpenEMPI
- Added the report artifacts as part of the distribution of the commercial edition of OpenEMPI
- Improved the support for remote connections to the graph database when used in place of the default embedded mode
- Fixed the support for asynchronous matching of records
- Added a new global identifier generator module to support the requirements of a customer
- Improved the vector configuration screen of the probabilistic matching algorithm by showing available sample record pairs per vector
- Improved the performance of bulk import of data by eliminating the generation of notification events
- Upgraded and improved the integration with the messaging system, switching to ActiveMQ Artemis as the default JMS server
- Performed technical refresh of underlying software such as the Spring framework, Hibernate, etc.
- A user account can be associated with a specified domain which enables filtering of review workload to associated domain
- Added support for fault-tolerance during operation of an instance of OpenEMPI through replication
- Added support in the commercial edition of OpenEMPI for collecting metrics about the operation and performance of the system
- Fixed an issue where generation of blocking key values generates huge blocks for single blocking field blocks with records that have blank values in the blocking field
- Fixed the substring transformation function so that it does not fail if the bounds specified fall outside the range of the field value affected
- Fixed a bug where an invalid field parameter used in the Entity REST API could cause a NullPointerException
- Fixed a bug where extending the entity schema of the default person entity would cause the Person REST API to fail on certain queries
- Fixed a bug where the export function of records from the system would be affected by the caching configuration of the export module
- Fixed a bug where the bulk import of records from another instance would cause the sequence generator to get out of sync
- Fixed a bug where generating a record link of match type wouldn't be persisted if a record link of probable link type already existed
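
The substring fix above concerns bounds that fall outside the field value. One way to make such a function tolerant is to clamp both bounds to the length of the value before slicing. The sketch below is illustrative only; the function name and the zero-based, end-exclusive bounds are assumptions. Python slices already tolerate an oversized end bound, so the clamping is spelled out mainly to mirror languages such as Java, where `String.substring` throws `StringIndexOutOfBoundsException`.

```python
from typing import Optional


def safe_substring(value: Optional[str], start: int, end: int) -> Optional[str]:
    """Return value[start:end] with both bounds clamped to the valid range."""
    if value is None:
        return None
    length = len(value)
    start = max(0, min(start, length))   # keep start within [0, length]
    end = max(start, min(end, length))   # keep end within [start, length]
    return value[start:end]
```

A start bound past the end of the value produces an empty string rather than an error, so a short field value no longer aborts the transformation.
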
- [OPENEMPI-373] - Added the ability for the user to clear a field value through the Entity REST API
- [OPENEMPI-372] - Bad data in a date field can cause blocking key value generation to fail and the blocking re-indexing process to end before processing every record
- [OPENEMPI-371] - The substring transformation function sometimes fails if the bounds specified fall outside the range of the field value affected
- [OPENEMPI-370] - In the entity REST API particular invalid values as field parameters can cause a NullPointerException
- [OPENEMPI-369] - Export process times out when the filesystem on which the database is stored is very slow
- [OPENEMPI-365] - Release of OrientDB incorporated in 3.3.6 broke some queries used as part of the report generation process
- [OPENEMPI-364] - Added auditing of records viewed by a user on the administrative console to the log; temporary solution until auditing of viewed records is implemented in a manner consistent with the auditing of other events
- [OPENEMPI-362] - Algorithm to find connected components is recursive and may run out of default stack space allocated in implementations where the system is not configured properly; algorithm was modified to operate in a non-recursive manner
- [OPENEMPI-361] - Before running the "link all records" process the operation to clear existing links is not working properly for all matching algorithms
- [OPENEMPI-357] - Exposed the method findPersonsById through the web service interface; it searches the repository by an identifier and returns all matching records
- [OPENEMPI-352] - Editing the definition of a custom field fails with duplicate field in entity message
- [OPENEMPI-351] - Sequence operations in OrientDB during concurrent updates can get out of synch and don't explicitly refresh their state with the database; caching causes the in-memory sequence to get out of synch with the database
- [OPENEMPI-350] - Formatting a date for comparison purposes uses a non-thread safe call which may fail under intense workloads
- [OPENEMPI-349] - When starting the database connection, the API currently used to establish initial connections uses default password instead of credentials provided
- [OPENEMPI-348] - OrientDB released a new version 2.2.17 with a fix that may affect OpenEMPI in operation currently using 2.2.16
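
OPENEMPI-362 above replaced a recursive connected-components search with a non-recursive one so that large groups of linked records cannot exhaust the call stack. The general technique is to keep an explicit stack of records still to visit. A minimal sketch, assuming links are supplied as pairs of record ids; the function name and data layout are hypothetical and not taken from the OpenEMPI code base.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def connected_components(links: Iterable[Tuple[Hashable, Hashable]]) -> List[Set[Hashable]]:
    """Group record ids into connected components without recursion."""
    adjacency: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for left, right in links:
        adjacency[left].add(right)
        adjacency[right].add(left)

    seen: Set[Hashable] = set()
    components: List[Set[Hashable]] = []
    for start in adjacency:
        if start in seen:
            continue
        component: Set[Hashable] = set()
        stack = [start]  # explicit stack takes the place of the call stack
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components
```

Because the pending work lives in a heap-allocated list, a chain of many thousands of linked records is handled the same way as a small one.
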
- [OPENEMPI-347] Viewing linked records through the user interface's search results option sometimes results in records that are not linked through a match link appearing in the list
- [OPENEMPI-346] Under some circumstances a record may end up having multiple global identifiers. In some scenarios involving more than two records that are linked together using both match and probable links, either through an update or a link operation, a record may end up having multiple global identifiers. It appears that in a recent update to OrientDB, the semantics of the API changed such that when an edge is deleted, the vertices associated with this edge do not end up with a null value for their incoming and outgoing edge but rather end up preserving their link association to a non-existent link.
- [OPENEMPI-345] Assigning global identifiers on an instance with hundreds of thousands of links is slow. The process was modified to operate more efficiently in such environments.
- [OPENEMPI-343] The flexible file loader uses a parsing library that is missing a null field value when it appears as the last field value in a record.
- Upgraded the version of OrientDB to 2.2.16 since an issue was fixed in the 2.2.16 release that may impact the operation of OpenEMPI under certain circumstances.
- [OPENEMPI-342] Blocking service indexing, when the blocking service is not configured well, creates a block for all null values of blocking keys. This slows down the matching process in cases where the blocking value is null for many records
- [OPENEMPI-341] Importing blocks of records sends notifications to interested parties and this slows down the process some with no real benefit. Disabling the generation of notifications helps improve the performance.
- [OPENEMPI-340] The exporter process uses the Avro library in such a way that it is caching data from previous records into new records for some fields causing invalid data to be imported.
- [OPENEMPI-336] After an import operation of records from a previous release, the record sequence is out of synch with the current record id.
- [OPENEMPI-337] Update of records that generates a link of different state from existing one causes the link not to be saved. For example if through the user interface a link is updated from a probable link to a match link, the new state of the link is not saved properly.
- [OPENEMPI-338] Update of blocking indexes in some cases doesn't properly create a new link block causing inefficiencies when records need to be evaluated for match status.
- Upgraded to a more recent version of the underlying graph database. This upgrade provides a number of performance improvements but made it necessary to modify how records from OpenEMPI are persisted at a low level. This change to the persistence of record data requires that a user migrates their data from the 3.2.0 release to the 3.3.0 release using the export/import tools that were developed.
- Upgraded to Apache Tomcat version 8.x from 7.x as the standard web application server for deploying OpenEMPI
- Developed a tool for exporting data from OpenEMPI and another one for importing data. This pair of tools performs the transformation that is needed for upgrading from the 3.2.0 release (or earlier) to the 3.3.0 release. The tools are available with the commercial edition of OpenEMPI.
- Extended the Web Services API to allow users to perform the workflow of importing data all the way through generating all links solely through web services calls. This feature allows customers that automate the process of linking data on a regular basis to perform the whole process programmatically without any manual intervention.
- Began the process of migrating the implementation of distance metrics to a different library that is more up-to-date and continues to be maintained. The process will be completed in the next release of OpenEMPI.
- [OPENEMPI-185] - Unlinking a person that has a voided person link causes an exception
- [OPENEMPI-279] - Merge operation does not generate update notifications
- [OPENEMPI-293] - Metrics generated by the data profiler seem to be incorrect in some cases.
- [OPENEMPI-301] - The find duplicate feature from the record update screen is not working
- [OPENEMPI-319] - Sessions expiring are causing the PIX/PDQ to be restarted
- [OPENEMPI-320] - Adding a record with an email address fails due to a validation error that is not caught and breaks the UI
- [OPENEMPI-321] - Sequence use is not working on an existing database
- [OPENEMPI-323] - Issue with handling of massive import declaration
- [OPENEMPI-325] - Unlinking records through the user interface doesn't properly update their global identifier
- [OPENEMPI-328] - Update before global identifiers have been assigned causes NPE
- Added Reporting Capabilities to the 3.2.0 release of the entity edition (Commercial Edition only). The users may now generate reports on the operation of the system. The reporting functionality was developed in an extensible manner so that new report types can be added over time. The current list of reports includes:
  - Event Activity
  - Data Profile Summary
  - Duplicate Summary Statistics
  - Potential Match Review Summary
  - Potential Match Review Detail
- Added the ability for the user to easily remove the global identifier assigned to all the records in the database. This feature is useful when first setting up an instance of OpenEMPI.
- Enhanced the probabilistic matching algorithm with a new feature we call null scoring which improves the matching performance of the algorithm in the presence of null values in matching fields.
- Added the ability for the user to run the data profiling process against all records in the repository on demand instead of having to run it as a scheduled background process
- Enhanced the performance of long-running operations for sites with millions of records. Operations such as assigning global identifiers, rebuilding the indexes of the blocking algorithms, and running the matching algorithm against all record pairs did not take advantage of the multiple processing nodes available in a clustered deployment of OpenEMPI (Commercial edition only). The implementation of these operations was modified to take advantage of all the nodes available in the cluster.
- Added a new transformation function for custom field generation that changes the case of the associated field to have a certain case. A parameter of the transformation function specifies the case of the transformed field.
- Added sequencing of transformation functions to the custom field generation process. This allows the user to define a custom field that is generated based on other custom fields, which implies that transformation functions can now be composed to form much more complex functions from simpler ones.
- Added the ability for the user to export all the records and links from the system. It is preferable that the export format is binary so that corruption of the file can be detected during subsequent loading of the file and ideally it should be easy to import the file for further processing in a cloud-based big data environment such as Hadoop.
- The process of assigning global identifiers to all records of an instance can take a long time on an instance that has millions of records.
- Modified the data profiler process when running against a file, to be able to specify the field delimiter along with the data types of the columns of the file instead of using the fixed field delimiter of the colon ':' character.
- The user should be able to invoke the re-indexing of the blocking fields at the specific entity level instead of having to run the process against all entities currently defined on an instance of OpenEMPI.
- Issue OPENEMPI-297: Under certain conditions, when trying to unlink two records from each other in the search result screen, the two records that are to be unlinked are not showing up side by side.
- Issue OPENEMPI-298: In the search screen after searching for a record and selecting to view the list of records linked to a selected record, the selected record itself was showing up in the list.
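
The sequencing of transformation functions noted above means a custom field can be derived from another custom field, so simple functions compose into more complex ones. A small sketch of that idea follows; the field names, the two transformation functions, and the list-of-definitions layout are invented for illustration and do not reflect OpenEMPI's configuration format.

```python
from typing import Callable, Dict, List, Optional, Tuple

Transform = Callable[[Optional[str]], Optional[str]]


def to_upper(value: Optional[str]) -> Optional[str]:
    """Hypothetical case-changing transformation."""
    return value.upper() if value is not None else None


def first_three(value: Optional[str]) -> Optional[str]:
    """Hypothetical prefix transformation."""
    return value[:3] if value is not None else None


def apply_custom_fields(record: Dict[str, Optional[str]],
                        definitions: List[Tuple[str, str, Transform]]) -> Dict[str, Optional[str]]:
    """Apply (target, source, transform) definitions in order.

    Because each result is written back before the next definition runs,
    a later custom field may use an earlier custom field as its source.
    """
    result = dict(record)
    for target, source, transform in definitions:
        result[target] = transform(result.get(source))
    return result
```

Given the definitions `("lastNameUpper", "lastName", to_upper)` followed by `("lastNamePrefix", "lastNameUpper", first_three)`, the second field is computed from the output of the first, which is the composition the release note describes.
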
Bug
- [OPENEMPI-485] - Parsing the key value pairs in the findByMatching request causes an unexpected exception
- [OPENEMPI-488] - Retrieve logged links does not filter by entity
- [OPENEMPI-491] - The findByMatching call should only return match links and not probable links. Currently it returns both of them.
- [OPENEMPI-492] - Persisting logged links asynchronously in chunks was causing concurrent modification exceptions in the List
- [OPENEMPI-495] - Retrieving record links returns the internal record id in an incomplete format which makes it difficult to retrieve the link

Improvement
- [OPENEMPI-461] - Add logging of every web services request
- [OPENEMPI-482] - Add parameters to StringComparator Resource and pass them down to the similarity metrics that use them
- [OPENEMPI-494] - Upgrade support for notifications over messaging
- [OPENEMPI-496] - Improve the performance of the candidate selection process for record pairs

Task
- [OPENEMPI-466] - Upgrade Confluence on the server to the latest version