Why can SHA-1 be considered a secure hash function? That's something I still wonder about.
I understand the reasoning behind why modern asymmetric algorithms are deemed secure: they are founded on mathematical problems that are conjectured to be "hard" to solve, e.g. discrete logarithms in finite fields or integer factorization. (Strictly speaking the hardness is unproven, but the security claims and reductions are relatively easy to follow if one is aware of the mathematical concepts.)
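To illustrate what I mean by "founded on a hard problem", here is a toy sketch (Python is just my choice of language, and the parameters are of course far too small for real security): modular exponentiation is fast, but nobody knows an efficient general way to invert it.

```python
# Toy illustration of the discrete logarithm problem.
# The parameters below are hypothetical and far too small to be secure.
p = 2_147_483_647          # a prime modulus (2**31 - 1)
g = 7                      # a base
x = 1_234_567              # the "secret" exponent
y = pow(g, x, p)           # forward direction: fast (square-and-multiply)

def brute_force_dlog(g, y, p):
    """Naive discrete log: try exponents k = 0, 1, 2, ... until g**k == y (mod p)."""
    acc = 1
    for k in range(p - 1):
        if acc == y:
            return k
        acc = (acc * g) % p
    return None

# Recovering x from y this way needs on the order of x multiplications;
# for real-world 2048-bit moduli the search space is astronomically large.
```

The asymmetry between the two directions is exactly what the security argument of, say, Diffie-Hellman rests on.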
But when it comes to symmetric cryptography and secure hash functions the picture becomes much less clear. I understand that there exist a lot of results and analysis for block ciphers and digest algorithms, but what are these results founded on?
E.g. when it comes to block ciphers you can find a lot of proofs that cipher algorithm X is resistant to a certain number of known attacks, or that some property holds, e.g. that every bit of the input affects the output, because this is deemed necessary, etc.
From the outside, the construction of cipher and digest algorithms looks like "trying to fiddle and mess with the input as much as possible" by applying bit shifts, XORs and so on.
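To make concrete what I mean by "fiddling with the input", here is roughly what a single round of SHA-1's compression function looks like (a Python sketch of the rounds 0-19 variant as described in FIPS 180; the five-word state (a, b, c, d, e) follows the standard's naming):

```python
def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def sha1_round(state, w, k=0x5A827999):
    """One SHA-1 compression round (rounds 0-19).

    `state` is the five 32-bit chaining words (a, b, c, d, e);
    `w` is one 32-bit word of the expanded message schedule;
    `k` is the round constant used in rounds 0-19.
    """
    a, b, c, d, e = state
    f = ((b & c) | (~b & d)) & 0xFFFFFFFF       # boolean "choose" function
    temp = (rotl(a, 5) + f + e + k + w) & 0xFFFFFFFF
    # Shift the state down one word, rotating b on the way.
    return (temp, a, rotl(b, 30), c, d)
```

It is exactly this mix of rotations, boolean functions, and modular additions whose rationale I am asking about.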
What I would like to know now (I would be grateful for deeper insight in either):
a) Could you provide me with pointers to resources (books preferred) that explain the design and security considerations one has to take into account when constructing a
a1) cipher algorithm
a2) digest algorithm
that would explain things such as why an S-box has to look exactly the way it does instead of any other way, and (probably even more important for my understanding) why it would be bad if it were constructed differently?
b) Do there exist, or are there attempts at, mathematical models of these "bit fiddling operations" (e.g. are "algebraic attacks" based on such a model)?
c) How do you "measure" the quality of a digest algorithm such as SHA-1? I.e. how can you say it's better to do a shift by two bits here instead of three, or an XOR, and why are these operations the basis of SHA-1 in the first place? Was it because, at the time, they seemed like the only known way to "maximally mess" with the input? I'm asking because it seems as if most SHA-3 candidates were either based on cipher algorithms (because more theoretical results exist there) or on new concepts such as sponge functions. To me the definitions of all of the SHA algorithms (MD5 too) still look like "Let's mess with this, shall we?" - but what's the reasoning behind it? Why do it the way they did?
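One concrete criterion I am aware of is the avalanche effect: flipping a single input bit should flip about half of the output bits. That at least is easy to check empirically; a quick sketch using Python's standard hashlib (for SHA-1's 160-bit output, roughly 80 bits should change on average):

```python
import hashlib

def avalanche(msg: bytes, bit: int) -> int:
    """Flip one input bit and count how many of SHA-1's 160 output bits change."""
    flipped = bytearray(msg)
    flipped[bit // 8] ^= 1 << (bit % 8)          # flip the chosen input bit
    h1 = int.from_bytes(hashlib.sha1(msg).digest(), "big")
    h2 = int.from_bytes(hashlib.sha1(bytes(flipped)).digest(), "big")
    return bin(h1 ^ h2).count("1")               # Hamming distance of the digests
```

But a test like this only tells me the design *achieves* good diffusion, not *why* these particular operations were chosen to achieve it, which is the part I am really after.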
I'd be more than happy if you could give me insight into any of these topics.