City trip search, principle of operation and advantages
Goals of the program and of the current text
The goal of the "city trip search" search engine is to show every interested person the advantages and principle of operation of a search engine sorting items by k-Pareto optimality. Therefore the web interface gives the user the opportunity to interact in a very direct way with the server. For real world usage the web interface can be simplified and customised. Nevertheless the interface is simple to use and the main goal of this text is to use the search engine and its interface as an example in order to explain explain the methodology of k-Pareto optimality based sorting.
For data availability reasons the locations in Western Europe available available on the websites geonames.org and wikipedia.org have been loaded into the search engine. Other uses of the search engine might however be more important or profitable from a financial point of view:
- websites for classifieds or job offers.
- decision support tools for investors who wish to diversify their stock portfolio while respecting various preferences (sector, profitability of the enterprise, age of the enterprise, value of the trademarks, ...).
Use case
We assume a person uses the search engine for finding a location to spend his holidays.
The upper selection boxes found under the heading "Select a subset of cities" offers the possibility to select a subset of cities. This part of the search engine works in a straightforward way. The number of retained locations is displayed above the results.
The lower selection boxes found under the heading "sort by preferences" offer the user the possibility to sort the obtained subset of locations according to his preferences. The 50 first locations of this preference based sort are finally displayed. Basically this preference based sort works as follows.
The preference based sort
The part "Sort the subset by preferences" of the interface offers the user the opportunity to encode his preferences as a composite preorder relation. He might prefer a location to another one if it is closer to Paris, has a smaller population and contains a castle.
The sub-relation "has a smaller population" might also be replaced by "has a population between 500 and 1000 inhabitants". The meaning of this kind of sub-relation "belongs to an interval" is the following.
- If two cities are both in the interval they are considered to be equally preferable,
- otherwise the one that is closer to the interval is considered to be preferable.

There are many ways to sort results while respecting a user's preferences. For example from this point of view lexicographic sorting and sorting by weighted means of differences to the desired values are both satisfactory.
This search engine respects the user's preferences by sorting according to an innovative criterion called k-Pareto optimality. The k-Pareto optimality of a location is the percentage of locations that are preferable to that location. It is the product of the percentages of locations that are preferable for each sub-criterion. Instead of computing products one takes logarithms which add-up. These additive sub-scores are called self-information and are visualised by the length of horizontal blue bars.

It is insightful to select two or three intervals, to change their lengths and to observe the results and scores.
Advantages of the k-Pareto optimality based sort
Because of its interesting mathematical properties the k-Pareto optimality based sorting has many advantages for a user.
- The rarer a characteristic is, the more it is valued. This is consistent with economic common sense.
- It is transparent. The user can see sub-scores and know why some object has been placed before some other object.
- No arbitrary parameters are intervening in the computations and no arbitrary adjustments are required.
- It offers maximum choice. It is the kind of sorting that offers a user most choice while respecting his preferences. Given a user is only looking for one location to spend his holiday, pairs of locations where one is by all search criteria strictly preferable to the other other are to avoid in the results. Indeed the user is likely to discard the strictly worse alternative. By avoiding best such redundant pairs k-Pareto Optimality based sorting makes it likelier the user finds a location that suits his personal needs best.
- It can be implemented in an efficient way
All those properties have been proved in a mathematical text and published at the AISTATS 2022 conference.
The above methodology is best adapted to cases where at least three search criteria need to be taken into account. For example three independent search criteria where each one filters out 9 results out of 10 together filter out approximately 999 results out of 1000. When therefore there are no-or no more results that fit all criteria exactly, k-Pareto optimality based sorting falls back as well to items that fit two criteria very well and one not at all, as to results that fit all three criteria quite well but not very well. In this way k-Pareto optimality enables the user to choose the compromise that seems best to him.
Indeed n numerical criteria make-up an n-dimensional space. From a geometric point of view we select in this space regions delimited by hyperboloid-like surfaces of constant k-Pareto optimality. Weighted means or Euclidean distances delimit planes or spheres and thus tend to only keep results that are an average fit for all criteria.
The above methodology applies well to any decision process that is performed in two steps. The first step is a preselection of a limited number of items out of a large collection according to the above methodology. The second step consists of the final selection of a unique item out of the preselected ones. This step might be performed by an expert via a complex and costly analysis or via an arbitrary person selecting according to his very own tastes.