My problem is that I need to load a binary file and work with individual bits from the file. After that I need to save it back out as bytes, of course.
My main question is which data type to work with: char or long int? Can I somehow work with chars?
Unless performance is mission-critical here, use whatever makes your code easiest to understand and maintain.
Before beginning to code anything, make sure you understand endianness, C++ type sizes, and how strange they can be.
unsigned char is the only type guaranteed to be exactly one byte (the natural byte of the machine, normally 8 bits), so if you design for portability it is a safe bet. But it isn't hard to use unsigned int or even long long to speed up the process, using sizeof (together with CHAR_BIT) to find out how many bits you get with each read, although the code gets more complex that way.
You should know that for true portability, none of the built-in types of C++ has a fixed size. An unsigned char might have 9 bits, and an int might only be guaranteed the range 0 to 65535, as noted in this and this answer.
Another alternative, as user1200129 suggests, is to use the Boost Integer library to remove all these uncertainties, assuming Boost is available on your platform. And if you are willing to pull in external libraries anyway, there are many serialization libraries to choose from.
But first and foremost, before you even start optimizing, make something simple that works. Then you can start profiling once you actually run into timing issues.
It really just depends on what you want to do, but in general you will get the best speed by sticking with the word size your program is compiled for: if you have a 32-bit program, choose 32-bit integers, and if you have a 64-bit program, choose 64-bit ones.
This could be different if your file stores individual bytes, or if it stores integers. Without knowing the exact structure of your file, it's difficult to say what the optimal choice is.
Your sentences are not really correct English, but as far as I can interpret the question, you are best off using the unsigned char type (which is one byte) so you can modify each byte separately.
Edit: changed according to comment.
If you are dealing with bytes, then the best way to do this is to use a size-specific type.
#include <algorithm>
#include <iterator>
#include <cstdint>
#include <vector>
#include <fstream>

int main()
{
    std::vector<std::uint8_t> file_data;
    std::ifstream file("file_name", std::ios::binary);

    // read: istreambuf_iterator reads raw bytes and never skips whitespace
    std::copy(std::istreambuf_iterator<char>(file),
              std::istreambuf_iterator<char>(),
              std::back_inserter(file_data));

    // write: open the output in binary mode as well
    std::ofstream out("outfile", std::ios::binary);
    std::copy(file_data.begin(), file_data.end(),
              std::ostreambuf_iterator<char>(out));
}
EDIT fixed bug
If you need to enforce how many bits are in an integer type, you should be using the <stdint.h> header. It is present in both C and C++. It defines types such as uint8_t (an 8-bit unsigned integer), which are guaranteed to resolve to the proper type on the platform. It also tells other programmers who read your code that the number of bits is important.
If you're worried about performance, you might want to use the larger-than-8-bit types, such as uint32_t. However, when reading and writing files, you will need to pay attention to the endianness of your system. Notably, if you have a little-endian system (e.g. x86, most ARM), the 32-bit value 0x12345678 will be written to the file as the four bytes 0x78 0x56 0x34 0x12, while on a big-endian system (e.g. SPARC, PowerPC, Cell, some ARM, and the Internet) it will be written as 0x12 0x34 0x56 0x78. The same goes for reading. You can, of course, work with 8-bit types and avoid this issue entirely.