Well, though many people have requested this, it would be almost impossible to code without ENORMOUS effort!
The problem is (and you probably know all this, but I just finished my psych assignment so I'm warmed up with typing hahahahaha) that standard CDs, wave files, mp3 files, etc. all store digital samples of sound waves. A common sample rate is 44,100 Hz, which means 44,100 samples per second (a fair few).
When those samples are played back, the ear picks up the resulting sound wave and our brains make it into a nice little sound so we can 'hear' it as something.
After all, the human brain cannot perceive anything unless it first converts the input into something we can understand.
Here is a wave at 1:42 (zoomed out heaps):
[IMAGE]
Each pixel represents 42 samples.
Here is a portion of that same wave at 1:1 (no zoom):
[IMAGE]
Each pixel represents 1 sample.
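For anyone wondering what "42 samples per pixel" means in practice, here is a rough sketch of one common way a wave editor might squash samples into pixel columns. I don't know how any particular editor actually does it, so the function name `peaks_per_pixel` and the 440 Hz test tone are just made up for illustration.

```python
import math

def peaks_per_pixel(samples, samples_per_pixel=42):
    """Collapse each run of samples into one (lowest, highest) pair per pixel column."""
    columns = []
    for start in range(0, len(samples), samples_per_pixel):
        chunk = samples[start:start + samples_per_pixel]
        columns.append((min(chunk), max(chunk)))
    return columns

# One second of a 440 Hz test tone at 44,100 Hz (a stand-in, not the wave pictured above)
tone = [math.sin(2 * math.pi * 440 * n / 44_100) for n in range(44_100)]

zoomed_out = peaks_per_pixel(tone)     # 1:42, each pixel covers 42 samples
no_zoom = peaks_per_pixel(tone, 1)     # 1:1, each pixel is a single sample

print(len(zoomed_out), "pixels wide at 1:42")
print(len(no_zoom), "pixels wide at 1:1")
```

Either way, all you get out is a list of numbers going up and down. Nothing in there is labelled "drum" or "guitar".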
Now, the problem is that from those above images, can you tell what is a drum beat? What is a guitar strum? What is a piano key?
And how about a voice?
Different tracker formats are:
ProTracker (MOD)
ScreamTracker (S3M)
FastTracker (XM)
Impulse Tracker (IT)
They store small samples of an instrument playing and then play them back at different pitches.
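Playing one stored sample back at different pitches is basically stepping through it faster or slower. Below is a very simplified sketch of that idea. The name `play_at_pitch` is my own, and it just grabs the nearest sample instead of doing anything clever, so treat it as an illustration rather than how any real tracker player does it.

```python
def play_at_pitch(sample, semitones):
    """Return `sample` resampled so it sounds `semitones` higher (negative = lower).

    Stepping through the sample faster raises the pitch and shortens the sound;
    stepping slower lowers the pitch and stretches it.
    """
    step = 2 ** (semitones / 12)   # 12 semitones up doubles the playback speed
    out = []
    pos = 0.0
    while pos < len(sample):
        out.append(sample[int(pos)])   # nearest-sample lookup, no interpolation
        pos += step
    return out

# A pretend instrument sample, 100 values long
instrument = list(range(100))

octave_up = play_at_pitch(instrument, 12)     # every second value
octave_down = play_at_pitch(instrument, -12)  # every value twice

print(len(instrument), len(octave_up), len(octave_down))
```

Going this direction (note data in, sound out) is the easy part. The hard part is going backwards from the finished sound to the notes.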
So now, how do you get the correct-sounding instrument to play? And there might be many playing at once.
So imagine having to sift through those above waves looking for not one, but maybe 32 individual instruments and a voice or 22 to boot.
One good thing about waves is that they are all made up of simpler waves (sine waves at different frequencies). But there can be many of these simpler waves in each instrument.
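To show the "made up of simpler waves" bit, here is a small sketch that mixes two pure tones and then measures how much of a given frequency is in the mix, using a plain (slow) discrete Fourier transform. The tiny sample rate, the two frequencies, and the helper names `sine` and `strength_at` are all made up to keep the example quick.

```python
import cmath
import math

RATE = 800  # samples per second; kept tiny so the slow maths below runs quickly

def sine(freq_hz, amplitude, count):
    """A pure tone: `count` samples of a sine wave at `freq_hz`."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / RATE) for n in range(count)]

def strength_at(samples, freq_hz):
    """How strongly `freq_hz` is present in `samples` (one bin of a DFT)."""
    total = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / RATE)
                for n, s in enumerate(samples))
    return 2 * abs(total) / len(samples)

# Two pretend "instruments" playing at once: a loud 50 Hz tone and a quieter 120 Hz tone
mixed = [a + b for a, b in zip(sine(50, 1.0, RATE), sine(120, 0.5, RATE))]

for freq in (50, 85, 120):
    print(freq, "Hz:", round(strength_at(mixed, freq), 3))
```

With two clean tones this works nicely. A real instrument is made of loads of these simpler waves, they change over time, and different instruments overlap on the same frequencies, which is where it gets ugly.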
Picking them apart is the good thing about our brains :)
We can do stuff computers cannot dream of... perhaps that's because they can't dream, or maybe they just lack humanity ;)
Short answer:
No :)
Other short answer:
If you give me an mp3, I can give you an XM or IT file in return.
However, the file will be very large, since the whole song would just be one giant sample... As large as the same file in wave format (not mp3).
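To put a rough number on "very large", here is a quick back-of-the-envelope sketch. It assumes CD-style audio (44,100 Hz, 16-bit, stereo) and a 128 kbps mp3, which are my assumptions for the example and not anything specific to your file. Headers are ignored.

```python
def wav_bytes(seconds, rate=44_100, channels=2, bytes_per_sample=2):
    """Approximate size of uncompressed audio (header ignored)."""
    return seconds * rate * channels * bytes_per_sample

def mp3_bytes(seconds, kbps=128):
    """Approximate size of a constant-bitrate mp3 (kbps is an assumed bitrate)."""
    return seconds * kbps * 1000 // 8

song = 4 * 60  # a four-minute song, in seconds

print("wave:", wav_bytes(song) / 1_000_000, "MB")
print("mp3: ", mp3_bytes(song) / 1_000_000, "MB")
```

So under those assumptions the "tracker" file would be roughly eleven times the size of the mp3 you started with.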
So... back to the first short answer :P