Problem
I’m trying to create a function to check multiple conditions of data.
The function will return a list if the data is errors.
Here is my idea:
error_lists = []
def check(A, B, C, D, E):
check condition 1 for A
if there is a error -> error_list.append(1)
check condition 2 for B
if there is a error -> error_list.append(2)
check condition 3 for C
if there is a error -> error_list.append(3)
check condition 4 for D
if there is a error -> error_list.append(4)
check condition 5 for E
if there is a error -> error_list.append(5)
return error_list
If A, B, C do not meet the requirement, the function will return the error_list: [1,2,3]. Likewise, the function will return [2,5] if only B or E do not meet the requirement
Actually I’ve already written the code, however I feel that it need to be improved.
P_list = [35000, 50000, 80000, 120000, 150000, 180,000]
def data_check(Staff_id, A ,B, x, y, z, t, u, D):
error_list = []
wh = x + y
tot = z + t
C = [x, y, z, t, u]
"""First condition"""
if A not in P_list:
print(f"Error type 1 in the data of {staff_id} {A} is not True")
error_list.append(1)
"""Second condition"""
if B >= 48:
print(f"Error type 2 in the data of {staff_id} {B} is not True")
error_list.append(2)
"""Third condition"""
for k in C:
if k < 0:
print(f"Error type 3 in the data of {staff_id} {k} is less than 0")
error_list.append(3)
"""Fourth condition"""
if D >= 3000000:
print(f"Error type 4 in the data of {staff_id} D is not true")
error_list.append(4)
"""Fifth condition"""
if tot >= wh:
print(f"Error type 5 in the data of {staff_id} E is not true")
error_list.append(5)
return print(error_list)
data_check("1a88s2s", 50000 ,71, 23, 28, 3,5, 12, 3500000)
Does anybody has any suggestion ?
Many thanks
Solution
P_list
should be a set, not a list, based on its usage; it’s only used for membership checks. You can also pre-bind to the__contains__
member to get a function reference you can call, constraining the set to one purpose.- Add PEP484 type hints
C
should be a tuple based on its usage since it’s immutable- Your
print
statements should not be mixed in with your logic, and should be separated to another method - Consider representing your errors not as free integers, but as constrained enumeration values
- Use an
any
generator for yourC
negative check 180,000
is likely incorrect and should be180_000
return print
is meaningless and will always returnNone
; instead consider returning your error values or yielding them as a generatorstaff_id
shouldn’t be passed intodata_check
at all; it’s only used for formatting error messages which again can exist in a separate method- Your letter-salad variable names are helping no one. You need to spell out what these actually mean.
- Moving your intermediate error-checking variables
C
,wh
andtot
closer to where they’re used is, I think, clearer
Suggested
import enum
from enum import Enum
from typing import Iterable
in_p_list = frozenset((35_000, 50_000, 80_000, 120_000, 150_000, 180_000)).__contains__
@enum.unique
class ErrorType(Enum):
REJECTED_A = 1
B_TOO_HIGH = 2
NEGATIVE_C = 3
D_TOO_HIGH = 4
TOTAL_TOO_HIGH = 5
def data_check(
A: int,
B: int,
x: int,
y: int,
z: int,
t: int,
u: int,
D: int,
) -> Iterable[ErrorType]:
if not in_p_list(A):
yield ErrorType.REJECTED_A
if B >= 48:
yield ErrorType.B_TOO_HIGH
C = (x, y, z, t, u)
if any(k < 0 for k in C):
yield ErrorType.NEGATIVE_C
if D >= 3_000_000:
yield ErrorType.D_TOO_HIGH
wh = x + y
tot = z + t
if tot >= wh:
yield ErrorType.TOTAL_TOO_HIGH
def print_errors(staff_id: str, errors: Iterable[ErrorType]) -> None:
for error in errors:
print(
f'Error type {error.value} in the data of {staff_id}: {error.name}'
)
def test() -> None:
print_errors("1a88s2s", data_check(50000, 71, 23, 28, 3, 5, 12, 3500000))
if __name__ == '__main__':
test()
Output
Error type 2 in the data of 1a88s2s: B_TOO_HIGH
Error type 4 in the data of 1a88s2s: D_TOO_HIGH
Adhere to PEP8
Like keep the style of identifiers consistent. data_check, Staff_id, A ,B, x, y, z, t, u, D
– what style is common for all those names?
You can pass a list as an argument
No need to write so many arguments and gather them in a list – you can gather them when you call your function and pass only one c
argument. The call will look like
data_check(staff_id, A, B, [x, y, z, t, u], D)
Also you can gather all trailing arguments in a list with * (the call whouln’t change).
Use the loop
All the conditions are simple (one expression), all the messages are almost uniform. Thus, you can use the loop like this:
def data_check(staff_id, *args):
conditions = [lambda x: x not in P_list,
lambda x: x>=48, #are you trying to compare with '0'?
lambda x: any(i<0 for i in x),
...
]
result = []
for i, (condition, arg) in enumerate(zip(conditions, args), 1):
if not condition(arg):
result.append(i)
printf(f"Error type {i} in the data of {staff_id}: {arg} failed")
return result
You can change lambdas to local functions; this can make the code more readable (with correct names):
def data_check(staff_id, *args):
def not_in_p_list(x):
return x not in P_list
def greater_then_48(x):
return x>48
def any_less_then_0(x):
return any(i<0 for i in x)
conditions = [not_in_p_list,
greater_then_48,
any_less_then_0,
...
]
Of course, you can (and should) give conditions more informative names, like not_listed_in_registry
or negative_weight
.