I'm reading the book TCP/IP Illustrated, Vol. 1, and it says that IP fragmentation is done by the network layer.
This is how I understand the concept so far: the network layer (L3) creates an "IP datagram" (IP header + data) and passes that byte array to the data link layer (L2). If L2 doesn't know the IP datagram structure and the received byte array is larger than the maximum size it can carry, it wouldn't know how to split the byte array and prepend an IP header to each chunk, since that is L3's responsibility. So L2 and L3 have to cooperate somehow.
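To make my mental model concrete, here is a rough Python sketch of how I imagine L3 doing the split once it has learned the MTU from L2. The function name `fragment`, the tuple layout and the fixed 20-byte header (IPv4 without options) are my own assumptions for illustration, not something taken from the book.

```python
IP_HEADER_LEN = 20  # assumption: IPv4 header without options


def fragment(payload: bytes, mtu: int) -> list[tuple[int, bool, bytes]]:
    """Split payload into (offset_in_8_byte_units, more_fragments, data) tuples.

    Hypothetical helper: a real stack would also build a full IP header
    for every fragment; here only the fields relevant to splitting are kept.
    """
    # The fragment offset field counts 8-byte units, so every fragment
    # except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - IP_HEADER_LEN) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")

    fragments = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        more = start + len(chunk) < len(payload)  # "more fragments" flag
        fragments.append((start // 8, more, chunk))
    return fragments
```

In this picture, the only thing L3 needs from L2 is the `mtu` value; L2 itself never looks inside the datagram.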
What confuses me is that a few pages later, when discussing the IP header, the total length field and the maximum size of an IP datagram, the book says: "Although it's possible to send a 65535-byte IP datagram, most link layers will fragment this".
Is my understanding wrong? Does L2 actually know the IP datagram structure, so that it can fragment an IP datagram itself? If so, why doesn't L2 always do the fragmentation, since it knows its own MTU?