1. Anonymize values
You said:
Statistical values of encrypted data also don't matter.
This means that if some facts contained in your database are publicly known, e.g. who bought the most expensive car or house in a given city, then some persons may be identified. Based on your database, further facts about these persons (not publicly known before) may then be extracted. But you said this is OK for you.
I would consider the following methods:
A) Encryption (even though you say you are looking for options other than encryption). Provided the scheme is deterministic (same key, no random IV or nonce per value), the same values will be replaced with the same encrypted values. You wanted to do joins based on these data; this will still be possible on the encrypted data. Well-established ciphers like AES or Threefish are resistant to known-plaintext attacks. Thus even if somebody can identify a few persons based on statistical data, this will not help to recover the encryption key and decrypt all the other data. One more advantage is that a solution based on encryption needs only a relatively small secret.
B) Lookup tables (you said you don't like this approach). Maintaining a lookup table may require extending it every time you get a new version of the data. Also, the secret is then the whole lookup table, which is bigger than a normal key sufficient for reliable encryption (say, a 256-bit key).
C) Other methods of data manipulation would effectively amount to home-grown encryption. In that case there is no guarantee that all your requirements will be fulfilled. That's why I'd suggest not considering any methods other than well-established encryption algorithms or lookup tables.
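A minimal Python sketch of the property that option A relies on: equal inputs map to equal tokens under one small secret, so joins and deduplication keep working. Note the assumptions. To stay within the standard library it uses HMAC-SHA256, which is a keyed one-way function rather than AES or Threefish, so the tokens cannot be decrypted. The function name `pseudonymize`, the sample tables and the all-zero key are hypothetical placeholders.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Return a stable token: equal values give equal tokens under the same key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Placeholder key for illustration only; a real key must be random and kept secret.
key = bytes(32)

# Hypothetical sample tables that share an e-mail column used for joins.
persons = [("alice@example.com", "Monterey"), ("bob@example.com", "Chicago")]
orders = [
    ("alice@example.com", "car"),
    ("bob@example.com", "bike"),
    ("alice@example.com", "house"),
]

# Replace the join column with tokens in both tables.
anon_persons = [(pseudonymize(email, key), city) for email, city in persons]
anon_orders = [(pseudonymize(email, key), item) for email, item in orders]

# The join on the tokenized column pairs the same rows as on the raw column.
city_by_token = dict(anon_persons)
joined = [(city_by_token[token], item) for token, item in anon_orders]
print(joined)
```

If you need to get the original values back, a reversible deterministic cipher would have to take the place of the HMAC; the join logic stays the same.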
2. Anonymize relations
You said:
Statistical values of encrypted data also don't matter.
But if you want to eliminate some statistical correlation, you can shuffle relations. Suppose you have a person table that refers to an addresses table using address IDs. Then you can take all address IDs, shuffle them and use the shuffled IDs in the person table. If you have a table with contact data like phone numbers, social network login names etc., you can shuffle the references there as well. Thus at any moment you will have consistent data: all references will point to really existing data in other tables, but the combination of these values will not give any benefit to an attacker. For instance, one person living in Los Angeles will get an address in Monterey, and the neighbor of this person will get an address in New York. And they will get birthdays from some persons from Chicago and from Gettysburg respectively. Thus many relations between data will be broken.
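The shuffling of address IDs described above could look roughly like this in Python. This is only a sketch: the helper `shuffle_column`, the table layout and the sample rows are assumptions for illustration, and the fixed seed is there only to make the demo reproducible.

```python
import random


def shuffle_column(rows, column, rng):
    """Return copies of the rows with the values of one column permuted among them."""
    values = [row[column] for row in rows]
    rng.shuffle(values)
    return [{**row, column: value} for row, value in zip(rows, values)]


# Hypothetical addresses table (ID -> city) and person table referring to it.
addresses = {1: "Los Angeles", 2: "Monterey", 3: "New York", 4: "Chicago"}
persons = [
    {"name": "Ann", "address_id": 1},
    {"name": "Ben", "address_id": 2},
    {"name": "Cid", "address_id": 3},
    {"name": "Dee", "address_id": 4},
]

# Fixed seed only for a reproducible demo; for real data an unpredictable
# source such as random.SystemRandom() would be the safer choice.
rng = random.Random(42)
shuffled_persons = shuffle_column(persons, "address_id", rng)

for person in shuffled_persons:
    print(person["name"], "->", addresses[person["address_id"]])
```

Every reference still points to an existing address, and each address is used exactly as often as before. A random permutation can leave individual rows unchanged by chance, so this breaks correlations statistically, not row by row.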
Implementing such shuffling may take more effort than encryption. For instance, if you use person IDs as references in 10 tables, then you would need to shuffle the IDs in all these tables using the same substitution table.
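A rough Python sketch of using one substitution table for several referencing tables. The helpers `build_substitution` and `apply_substitution` and the two sample tables are hypothetical; the point is only that the same mapping is applied everywhere, so rows that referred to the same person before still refer to the same (different) person afterwards.

```python
import random


def build_substitution(ids, rng):
    """Return a dict that maps every ID to an ID of the same set (a permutation)."""
    originals = sorted(set(ids))
    targets = originals[:]
    rng.shuffle(targets)
    return dict(zip(originals, targets))


def apply_substitution(rows, column, mapping):
    """Return copies of the rows with the reference column rewritten via the mapping."""
    return [{**row, column: mapping[row[column]]} for row in rows]


# Hypothetical tables that both refer to persons by ID.
orders = [
    {"person_id": 1, "item": "car"},
    {"person_id": 2, "item": "bike"},
    {"person_id": 1, "item": "house"},
]
phones = [
    {"person_id": 1, "phone": "555-0100"},
    {"person_id": 3, "phone": "555-0101"},
]

# Fixed seed only for a reproducible demo.
rng = random.Random(7)
mapping = build_substitution([1, 2, 3], rng)

# The same substitution table is used for every referencing table.
anon_orders = apply_substitution(orders, "person_id", mapping)
anon_phones = apply_substitution(phones, "person_id", mapping)
print(mapping)
```

Because the mapping is a permutation of existing IDs, all references stay valid, and joins between `anon_orders` and `anon_phones` on `person_id` match the same row pairs as before.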
Also, depending on the logic of your application, some relations may need to be kept and should not be randomly shuffled. Only you can decide what manipulations are acceptable in your case.
3. Anonymize without encryption
In some cases encryption may not be needed at all. If the mere fact that a person has any relation to your application is sensitive, e.g. if you maintain data about purchases of some weapon or about anonymous alcoholics, then you need some sort of encryption; see part 1 above. But if your application is an ordinary online shop, a relation to it does not harm anyone, and the number of entries is relatively big (not 3-5 persons, but say 100 000 persons), then encryption may not be needed at all. Just shuffle all the important relations: between person name and address, between person name and contact data, between orders and delivery addresses, etc. Thus every single piece of data will be real and unencrypted, but taken together they will not give any correlation to real persons.
Performance isn't critical at all (my case - batch processing, it's performance on write which happens pretty infrequently). Statistical values remain - also doesn't matter, I need to preserve only consistency and distinct properties for joins/deduplications. – VB_ Feb 09 '21 at 21:44