# Sorting a list of strings in different lexicographic orders

A recently asked question turned to sorting in reverse alphabetical order.
In that case the problem is solved simply: sort by the numbers in the opposite direction first, then reverse the whole result. However, it will probably not always be possible to solve the problem this way.

It seemed to me that this discussion had little to do with the original question and deserved a question of its own.

The question in short:
Given a dataset:

``````data = [
    ('Drifted', 1),
    ('Greek', 3),
    ('In', 1),
    ('River', 2),
    ('Sees', 1),
    ('IN', 2),
    ('River', 1),
    ('Cancer', 2),
    ('Sunun', 1),
    ('Hand', 2),
    ('For', 1),
    ('Greek', 1),
    ('DAC', 1)
]
``````

The task: first sort this list by the second element (the number) in descending order and, when the numbers match, sort within the group in reverse lexicographic order.

Sorting in direct lexicographic order within each group takes one line:

``````sorted(data, key=lambda x: (-x[1], x[0]))
``````

This particular task can also be solved in one line:

``````reversed(sorted(data, key=lambda x: (x[1], x[0])))
# or
sorted(data, key=lambda x: (x[1], x[0]), reverse=True)
``````
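
As a quick check, here is a minimal sketch that runs both one-liners on a shortened copy of the dataset. The name `sample` is only illustrative. Note that `reversed()` returns an iterator, so it is wrapped in `list()` before comparing.

```python
# Shortened copy of the dataset from the question (illustrative name).
sample = [
    ('Greek', 3), ('River', 2), ('Cancer', 2),
    ('Hand', 2), ('Greek', 1), ('DAC', 1),
]

# Sort ascending by (number, word), then flip the whole list.
flipped = list(reversed(sorted(sample, key=lambda x: (x[1], x[0]))))

# Same key, but let sorted() do the flipping.
descending = sorted(sample, key=lambda x: (x[1], x[0]), reverse=True)

print(flipped)
print(flipped == descending)
```

Both variants agree here because no two records share the same full key; with equal keys they can differ in how ties are ordered.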

This solution really does give the correct result. However, simply reversing the list is not always enough (especially when there are more than two sort keys, or when both keys are strings).

Reverse sorting by a number is done with unary negation: `-key`
Is there anything similar for strings?

# UPD

Answer to the question from the comments about `reverse=True`:
In some cases this parameter does not help at all.

Input:

``````data_hard = [
    ('Ivanov', 'Sergey', 'Fedorovich'),
    ('Ivanov', 'Alexey', 'Evgenievich'),
    ('Afanasyev', 'Artyom', 'Grigorievich'),
    ('Ivanov', 'Ivan', 'Ivanovich'),
    ('Petrov', 'Sergey', 'Viktorovich'),
    ('Pushkin', 'Alexander', 'Sergeyevich'),
    ('Ivanov', 'Sergey', 'Petrovich'),
    ('Gogol', 'Nikolai', 'Vasilyevich'),
]
``````

Sort the list in direct order by surname. When surnames match, sort the group in reverse order by first name. When both the surname and the first name match, sort the subgroup in reverse order by patronymic.

Result of direct sorting:

``````>>> sorted(data_hard, key=lambda x: (x[0], x[1], x[2]))
[('Afanasyev', 'Artyom', 'Grigorievich'),
 ('Gogol', 'Nikolai', 'Vasilyevich'),
 ('Ivanov', 'Alexey', 'Evgenievich'),
 ('Ivanov', 'Ivan', 'Ivanovich'),
 ('Ivanov', 'Sergey', 'Fedorovich'),
 ('Ivanov', 'Sergey', 'Petrovich'),
 ('Petrov', 'Sergey', 'Viktorovich'),
 ('Pushkin', 'Alexander', 'Sergeyevich')]
``````

Result of reverse sorting:

``````>>> sorted(data_hard, key=lambda x: (x[0], x[1], x[2]), reverse=True)
[('Pushkin', 'Alexander', 'Sergeyevich'),
 ('Petrov', 'Sergey', 'Viktorovich'),
 ('Ivanov', 'Sergey', 'Petrovich'),
 ('Ivanov', 'Sergey', 'Fedorovich'),
 ('Ivanov', 'Ivan', 'Ivanovich'),
 ('Ivanov', 'Alexey', 'Evgenievich'),
 ('Gogol', 'Nikolai', 'Vasilyevich'),
 ('Afanasyev', 'Artyom', 'Grigorievich')]
``````

As you can see, neither option satisfies the conditions. That is, the `reverse` parameter is not always a suitable solution for reverse sorting of strings.
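
To make the mismatch explicit, the sketch below writes the required order out by hand (the name `wanted` is illustrative) and compares it with both whole-tuple sorts. It assumes the Latin spellings used on this page, where 'Petrovich' sorts after 'Fedorovich'.

```python
data_hard = [
    ('Ivanov', 'Sergey', 'Fedorovich'),
    ('Ivanov', 'Alexey', 'Evgenievich'),
    ('Afanasyev', 'Artyom', 'Grigorievich'),
    ('Ivanov', 'Ivan', 'Ivanovich'),
    ('Petrov', 'Sergey', 'Viktorovich'),
    ('Pushkin', 'Alexander', 'Sergeyevich'),
    ('Ivanov', 'Sergey', 'Petrovich'),
    ('Gogol', 'Nikolai', 'Vasilyevich'),
]

# Order required by the task, written out by hand:
# surname ascending, first name descending, patronymic descending.
wanted = [
    ('Afanasyev', 'Artyom', 'Grigorievich'),
    ('Gogol', 'Nikolai', 'Vasilyevich'),
    ('Ivanov', 'Sergey', 'Petrovich'),
    ('Ivanov', 'Sergey', 'Fedorovich'),
    ('Ivanov', 'Ivan', 'Ivanovich'),
    ('Ivanov', 'Alexey', 'Evgenievich'),
    ('Petrov', 'Sergey', 'Viktorovich'),
    ('Pushkin', 'Alexander', 'Sergeyevich'),
]

direct = sorted(data_hard)                 # all three keys ascending
flipped = sorted(data_hard, reverse=True)  # all three keys descending

print(direct == wanted, flipped == wanted)  # False False
```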

In complex cases you have to sort in several passes.
The analog of

``````sorted(data, key=lambda x: (-x[1], x[0]))
``````

that works with more than just numbers is

``````sorted(sorted(data, key=lambda x: x[0]), key=lambda x: x[1], reverse=True)
``````

``````from operator import itemgetter
sorted(sorted(data, key=itemgetter(0)), key=itemgetter(1), reverse=True)
``````
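
Applied to the three-key task from the question, the multi-pass idea might look like the sketch below: sort by the least significant key first and finish with the most significant one. The variable names `step` and `result` are illustrative.

```python
from operator import itemgetter

data_hard = [
    ('Ivanov', 'Sergey', 'Fedorovich'),
    ('Ivanov', 'Alexey', 'Evgenievich'),
    ('Afanasyev', 'Artyom', 'Grigorievich'),
    ('Ivanov', 'Ivan', 'Ivanovich'),
    ('Petrov', 'Sergey', 'Viktorovich'),
    ('Pushkin', 'Alexander', 'Sergeyevich'),
    ('Ivanov', 'Sergey', 'Petrovich'),
    ('Gogol', 'Nikolai', 'Vasilyevich'),
]

# Each later pass keeps the order produced by the earlier passes
# among records that compare equal on its own key.
step = sorted(data_hard, key=itemgetter(2), reverse=True)  # patronymic, descending
step = sorted(step, key=itemgetter(1), reverse=True)       # first name, descending
result = sorted(step, key=itemgetter(0))                   # surname, ascending

for row in result:
    print(row)
```

Each key gets its own direction, at the cost of one pass per key.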

### UPD

Explanation. Take the list

``````[('Ivanov', 'Sergey', 'Fedorovich'),
 ('Petrov', 'Alexander', 'Vladimirovich'),
 ('Ivanov', 'Alexey', 'Evgenievich'),
 ('Afanasyev', 'Artyom', 'Grigorievich'),
 ('Ivanov', 'Ivan', 'Ivanovich'),
 ('Petrov', 'Sergey', 'Viktorovich'),
 ('Pushkin', 'Alexander', 'Sergeyevich'),
 ('Ivanov', 'Sergey', 'Petrovich'),
 ('Gogol', 'Nikolai', 'Vasilyevich')]
``````

and sort it by first name:

``````[('Petrov', 'Alexander', 'Vladimirovich'),
 ('Pushkin', 'Alexander', 'Sergeyevich'),
 ('Ivanov', 'Alexey', 'Evgenievich'),
 ('Afanasyev', 'Artyom', 'Grigorievich'),
 ('Ivanov', 'Ivan', 'Ivanovich'),
 ('Gogol', 'Nikolai', 'Vasilyevich'),
 ('Ivanov', 'Sergey', 'Fedorovich'),
 ('Petrov', 'Sergey', 'Viktorovich'),
 ('Ivanov', 'Sergey', 'Petrovich')]
``````

Note that records with equal keys (first names here) kept their order relative to each other; for example, Ivanov Sergey Fedorovich comes before Petrov Sergey Viktorovich. This is not a coincidence: the behavior is guaranteed.

Now sort by surname in reverse order:

``````data_hard.sort(key=itemgetter(0), reverse=True)
``````
``````[('Pushkin', 'Alexander', 'Sergeyevich'),
 ('Petrov', 'Alexander', 'Vladimirovich'),
 ('Petrov', 'Sergey', 'Viktorovich'),
 ('Ivanov', 'Alexey', 'Evgenievich'),
 ('Ivanov', 'Ivan', 'Ivanovich'),
 ('Ivanov', 'Sergey', 'Fedorovich'),
 ('Ivanov', 'Sergey', 'Petrovich'),
 ('Gogol', 'Nikolai', 'Vasilyevich'),
 ('Afanasyev', 'Artyom', 'Grigorievich')]
``````

And again, Petrov Alexander Vladimirovich comes before Petrov Sergey Viktorovich, as in the previous table. This means the results of the previous sort were not lost. Although many records changed position, because the order of surnames takes precedence over the order of first names, the records whose relative order did not need to change kept it.

This is possible because the sorting algorithm used by `sort` and `sorted` is guaranteed to be stable.
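
A small sketch of that guarantee, using a hypothetical three-record list `people`: even with `reverse=True`, records that compare equal on the key keep the relative order they had in the input.

```python
from operator import itemgetter

# Two records share a surname; their relative order is the thing to watch.
people = [
    ('Petrov', 'Alexander', 'Vladimirovich'),
    ('Ivanov', 'Sergey', 'Fedorovich'),
    ('Petrov', 'Sergey', 'Viktorovich'),
]

by_surname_desc = sorted(people, key=itemgetter(0), reverse=True)

# Both Petrovs move ahead of Ivanov, but Alexander still precedes Sergey,
# exactly as in the input: reverse=True does not flip equal elements.
print(by_surname_desc)
```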

Another (less efficient) solution is to write your own wrapper class that swaps the operands of the comparison.

``````from functools import total_ordering

@total_ordering
class Reverse:
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        # Operands are swapped: "less than" becomes "greater than".
        return other.value < self.value

    def __eq__(self, other):
        return other.value == self.value
``````
``````sorted(data, key=lambda x: (Reverse(x[1]), x[0]))
``````
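
The same wrapper can, in principle, be applied to string keys as well. The sketch below repeats the `Reverse` class so that it runs on its own and uses it for the surname / first name / patronymic task; applying it to that dataset is an extrapolation, not something shown in the answer itself.

```python
from functools import total_ordering


@total_ordering
class Reverse:
    """Wrap a value so that comparisons come out the other way round."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return other.value < self.value

    def __eq__(self, other):
        return other.value == self.value


data_hard = [
    ('Ivanov', 'Sergey', 'Fedorovich'),
    ('Ivanov', 'Alexey', 'Evgenievich'),
    ('Afanasyev', 'Artyom', 'Grigorievich'),
    ('Ivanov', 'Ivan', 'Ivanovich'),
    ('Petrov', 'Sergey', 'Viktorovich'),
    ('Pushkin', 'Alexander', 'Sergeyevich'),
    ('Ivanov', 'Sergey', 'Petrovich'),
    ('Gogol', 'Nikolai', 'Vasilyevich'),
]

# Surname ascending; first name and patronymic descending.
result = sorted(data_hard, key=lambda x: (x[0], Reverse(x[1]), Reverse(x[2])))

for row in result:
    print(row)
```

Only one pass is needed, but every comparison goes through Python-level methods, which is where the extra cost comes from.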

### UPD

Explanation:

`Reverse` plays the same role as the minus sign does for numbers, i.e. it is a function `f` such that

if `x < y`, then `f(y) < f(x)`
if `x = y`, then `f(y) = f(x)`

And even though this method is slower than the previous one (it involves more interpreted computation), it is still more general than converting the string into a list of negative numbers, and it is guaranteed to work correctly with any type that implements comparison correctly.

Since strings have no negation, we can resort to a trick: convert the string into a list of numbers, where each number is the code point of the corresponding character:

``````sorted(data, key=lambda x: (-x[1], [-o for o in map(ord, x[0])]))
``````

To make this more concrete, here is the numeric representation of the word 'Drifted':

``````>>> list(map(ord, data[0][0]))
[68, 114, 105, 102, 116, 101, 100]
``````

Sorting by such lists gives the direct order. All that remains is to negate every number in the sequence:

``````>>> [-o for o in map(ord, data[0][0])]
[-68, -114, -105, -102, -116, -101, -100]
``````

With these keys the sort runs in reverse lexicographic order.
Result:

``````>>> sorted(data, key=lambda x: (-x[1], [-o for o in map(ord, x[0])]))
[('Greek', 3),
 ('River', 2),
 ('IN', 2),
 ('Hand', 2),
 ('Cancer', 2),
 ('Sunun', 1),
 ('Sees', 1),
 ('River', 1),
 ('In', 1),
 ('Greek', 1),
 ('For', 1),
 ('Drifted', 1),
 ('DAC', 1)]
``````

As required, the list is sorted by number in descending order and, within groups that share a number, in reverse lexicographic order.
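
One thing worth checking with this key is the case where one string is a prefix of another. The sketch below compares the negated code point key with plain `reverse=True` on three short words. The helper `neg_key_padded` and its sentinel value are hypothetical, not taken from the thread; they show one possible way to handle prefixes.

```python
def neg_key(s):
    # The key from the text: negated code points.
    return [-ord(c) for c in s]


def neg_key_padded(s):
    # Hypothetical variant: a trailing sentinel (1) that is greater than any
    # negated code point, so a longer string sorts before its own prefix.
    return [-ord(c) for c in s] + [1]


words = ['ab', 'abc', 'b']

print(sorted(words, reverse=True))        # ['b', 'abc', 'ab']
print(sorted(words, key=neg_key))         # ['b', 'ab', 'abc']
print(sorted(words, key=neg_key_padded))  # ['b', 'abc', 'ab']
```

On the datasets in this question no word is a prefix of another within the same group, so the plain negated key gives the expected result there.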

# UPD

Result of sorting the data from the update:

``````>>> sorted(
...     data_hard,
...     key=lambda x:
...         (x[0], [-o for o in map(ord, x[1])], [-o for o in map(ord, x[2])])
... )
[('Afanasyev', 'Artyom', 'Grigorievich'),
 ('Gogol', 'Nikolai', 'Vasilyevich'),
 ('Ivanov', 'Sergey', 'Petrovich'),
 ('Ivanov', 'Sergey', 'Fedorovich'),
 ('Ivanov', 'Ivan', 'Ivanovich'),
 ('Ivanov', 'Alexey', 'Evgenievich'),
 ('Petrov', 'Sergey', 'Viktorovich'),
 ('Pushkin', 'Alexander', 'Sergeyevich')]
``````

This way of building the key is well suited to tasks like this.

Your solution can be shortened slightly:

``````In : sorted(
...:     data_hard,
...:     key=lambda x: ([-ord(c) for c in x[0]], x[1], x[2]),
...:     reverse=True
...: )
Out:
[('Afanasyev', 'Artyom', 'Grigorievich'),
 ('Gogol', 'Nikolai', 'Vasilyevich'),
 ('Ivanov', 'Sergey', 'Petrovich'),
 ('Ivanov', 'Sergey', 'Fedorovich'),
 ('Ivanov', 'Ivan', 'Ivanovich'),
 ('Ivanov', 'Alexey', 'Evgenievich'),
 ('Petrov', 'Sergey', 'Viktorovich'),
 ('Pushkin', 'Alexander', 'Sergeyevich')]
``````

In addition, you can bring in the heavy artillery, Pandas:

``````import pandas as pd
df = pd.DataFrame(data_hard, columns=["last_name", "first_name", "mid_name"])
``````

``````In : df.sort_values(df.columns.to_list(), ascending=[True, False, False])
Out:
   last_name  first_name      mid_name
2  Afanasyev      Artyom  Grigorievich
7      Gogol     Nikolai   Vasilyevich
6     Ivanov      Sergey     Petrovich
0     Ivanov      Sergey    Fedorovich
3     Ivanov        Ivan     Ivanovich
1     Ivanov      Alexey   Evgenievich
4     Petrov      Sergey   Viktorovich
5    Pushkin   Alexander   Sergeyevich
``````

As a list of lists:

``````In : df.sort_values(df.columns.to_list(), ascending=[True, False, False]).to_numpy().tolist()
Out:
[['Afanasyev', 'Artyom', 'Grigorievich'],
 ['Gogol', 'Nikolai', 'Vasilyevich'],
 ['Ivanov', 'Sergey', 'Petrovich'],
 ['Ivanov', 'Sergey', 'Fedorovich'],
 ['Ivanov', 'Ivan', 'Ivanovich'],
 ['Ivanov', 'Alexey', 'Evgenievich'],
 ['Petrov', 'Sergey', 'Viktorovich'],
 ['Pushkin', 'Alexander', 'Sergeyevich']]
``````

Update: speed comparison of the different options on a list of 10,000 lists:

``````In : data_big = data * 1000
In : len(data_big)
Out: 10000
In : %%timeit
...: sorted(
...:     data_big,
...:     key=lambda x:
...:         (x[0], [-o for o in map(ord, x[1])], [-o for o in map(ord, x[2])])
...: )
...:
53.8 ms ± 220 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In : %%timeit
...: sorted(sorted(data_big, key=itemgetter(0)), key=itemgetter(1), reverse=True)
...:
3.66 ms ± 504 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
In : %%timeit
...: sorted(
...:     data_big,
...:     key=lambda x:
...:         (x[0], [-o for o in map(ord, x[1])], [-o for o in map(ord, x[2])])
...: )
...:
54.4 ms ± 797 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In : %%timeit
...: df = pd.DataFrame(data_big, columns=["last_name", "first_name", "mid_name"])
...: df.sort_values(df.columns.to_list(), ascending=[True, False, False]).to_numpy().tolist()
...:
8.91 ms ± 2.86 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
``````
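
Outside IPython, a comparable measurement could be sketched with the standard `timeit` module. The function names, the four-record sample and the repeat count below are all illustrative assumptions, and the absolute numbers will depend on the machine, so no timings are quoted.

```python
import timeit
from operator import itemgetter

# Small illustrative sample, repeated to get a list of 4,000 records.
rows = [
    ('Ivanov', 'Sergey', 'Fedorovich'),
    ('Afanasyev', 'Artyom', 'Grigorievich'),
    ('Ivanov', 'Sergey', 'Petrovich'),
    ('Gogol', 'Nikolai', 'Vasilyevich'),
] * 1000


def multi_pass(items):
    # Three stable passes, least significant key first.
    items = sorted(items, key=itemgetter(2), reverse=True)
    items = sorted(items, key=itemgetter(1), reverse=True)
    return sorted(items, key=itemgetter(0))


def negated_ords(items):
    # One pass with negated code points for the descending keys.
    return sorted(
        items,
        key=lambda x: (x[0], [-ord(c) for c in x[1]], [-ord(c) for c in x[2]]),
    )


# Both approaches should agree before their speed is compared.
assert multi_pass(rows) == negated_ords(rows)

for func in (multi_pass, negated_ords):
    seconds = timeit.timeit(lambda: func(rows), number=10)
    print(f"{func.__name__}: {seconds / 10 * 1000:.2f} ms per call")
```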
