How to tell if a file is binary

I borrowed the following function from the PHP manual user notes:

[PHP]
if (!function_exists('is_binary')) {
    /**
     * Determine if a file is binary. Useful for doing file content editing
     *
     * @access public
     * @param mixed $link Complete path to file (/path/to/file)
     * @return boolean
     * @link http://us3.php.net/filesystem#30152
     * @see link user notes regarding this created function
     */
    function is_binary($link) {
        $tmpStr = '';
        $fp = @fopen($link, 'rb');
        $tmpStr = @fread($fp, 256);
        @fclose($fp);

        if ($tmpStr) {
            $tmpStr = str_replace(chr(10), '', $tmpStr);
            $tmpStr = str_replace(chr(13), '', $tmpStr);

            $tmpInt = 0;

            for ($i = 0; $i < strlen($tmpStr); $i++) {
                if (extension_loaded('ctype')) {
                    if (!ctype_print($tmpStr[$i])) $tmpInt++;
                } elseif (!eregi("[[:print:]]+", $tmpStr[$i])) {
                    $tmpInt++;
                }
            }

            if ($tmpInt > 5) return(0); else return(1);
        } else {
            return(0);
        }
    }
}
[/PHP]

Problem is that the results are completely backwards:

[PHP]
print_r(is_binary("/path/to/my/image.jpg")); // RETURNS 0
print_r(is_binary("/path/to/my/text.txt")); // RETURNS 1
[/PHP]

It's basically saying that binary files are ASCII and all ASCII files
are binary! Is there a better function out there that can tell me if a
file is binary or not?

Thanx
Phil

Dec 6 '05 #1
You might try using the file command on Linux systems.
--
Geeks Home
www.fahimzahid.com
Dec 6 '05 #2
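A minimal sketch of the same idea without shelling out, assuming PHP's fileinfo extension is available (it is built on libmagic, the library behind the file command). The helper name is_binary_by_magic is hypothetical and not from the thread; it asks libmagic for the character encoding and treats the answer "binary" as the verdict.

```php
<?php
// Hypothetical helper (not from the thread): classify a file by asking the
// fileinfo extension for its character encoding. libmagic reports "binary"
// for non-text data and names such as "us-ascii" or "utf-8" for text.
function is_binary_by_magic(string $path): bool
{
    $finfo = new finfo(FILEINFO_MIME_ENCODING);
    $encoding = $finfo->file($path);

    if ($encoding === false) {
        throw new RuntimeException("Could not inspect $path");
    }

    return $encoding === 'binary';
}
```

Note that this returns true for binary files, so the result reads the way the function name suggests.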
Here is another suggestion:
<http://groups.google.co.uk/group/comp.lang.php/browse_thread/thread/1d01eb12555a940d/cbf2065e8238ac45#cbf2065e8238ac45>

---
Steve

Dec 6 '05 #3
Thanx but that was not enough information for me.

1) Is the "diff tool" PHP, Linux, Windows, UNIX, Perl, ...??
2) Is there a comprehensive way of determining exactly how many bytes
you read into this diff tool (e.g. Windows: 1024? Linux: 256? FreeBSD:
512)?

Thanx
Phil

Dec 7 '05 #4
I wasn't suggesting you use diff itself, just create a test in PHP
using a similar algorithm. There's no definitive number of bytes to
test, as it is not an exact science. The more bytes you test, the more
reliable the result.

---
Steve

Dec 7 '05 #5
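A rough sketch of such a test in PHP, under stated assumptions. The function name looks_binary, the 1024-byte sample and the 10% threshold are all hypothetical choices, not values from the thread or from diff. It also shows why the original function appeared backwards: `if ($tmpInt > 5) return(0); else return(1);` returns 0 when more than five non-printable bytes are found, which is exactly the case where the sample looks binary. The sketch returns true for binary instead.

```php
<?php
// Hypothetical helper (not from the thread). $sampleSize and $threshold are
// illustrative defaults; a larger sample gives a more reliable guess.
function looks_binary(string $path, int $sampleSize = 1024, float $threshold = 0.10): bool
{
    $fp = fopen($path, 'rb');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $sample = fread($fp, $sampleSize);
    fclose($fp);

    // An empty or unreadable sample gives no evidence of binary content.
    if ($sample === false || $sample === '') {
        return false;
    }

    // A NUL byte almost never occurs in plain text, so treat it as decisive.
    if (strpos($sample, "\0") !== false) {
        return true;
    }

    // Count bytes that are neither printable ASCII nor tab, LF or CR.
    // Assumption: the text is mostly ASCII; heavily non-ASCII text
    // (for example UTF-8 in a non-Latin script) would be flagged as binary.
    $suspicious = 0;
    $length = strlen($sample);
    for ($i = 0; $i < $length; $i++) {
        $byte = ord($sample[$i]);
        $isWhitespace = ($byte === 9 || $byte === 10 || $byte === 13);
        $isPrintable = ($byte >= 32 && $byte <= 126);
        if (!$isWhitespace && !$isPrintable) {
            $suspicious++;
        }
    }

    // true means "binary": many suspicious bytes relative to the sample size.
    return ($suspicious / $length) > $threshold;
}
```

Using a ratio rather than a fixed count of five keeps the verdict from depending on how many bytes were sampled.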
