Thursday, June 21, 2007

About inotify

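What is inotify, and what can it be used for? Let's start with the kernel documentation, Documentation/filesystems/inotify.txt. (As an aside, although the kernel documentation consists of plain, unformatted txt files, it reads very well, and the authors always describe the subject clearly in the most concise language.) The document calls inotify "a powerful yet simple file change notification system". Put simply, it is a mechanism by which the kernel notifies user-space client programs of file system changes, and it is both powerful and simple.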

The inotify user interface consists mainly of the following three prototypes:
Initialization: int inotify_init(void);
Adding a watch: int inotify_add_watch(int fd, const char *path, uint32_t mask);
Removing a watch: int inotify_rm_watch(int fd, uint32_t wd);

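The prototype that the kernel documentation gives for inotify_rm_watch seems to be slightly off: it names the second parameter mask, whereas it should actually be the watch descriptor returned by inotify_add_watch.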

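According to the documentation, using inotify involves roughly the following steps:
1. int fd = inotify_init(); initializes an inotify instance.
2. int wd = inotify_add_watch(fd, path, mask); adds a watch. Here mask is a set of bit flags for one or more events; the event definitions are in (linux/inotify.h) or (sys/inotify.h). The former is the Linux kernel header, the latter the header provided by glibc.
3. ssize_t len = read(fd, buf, BUF_LEN); reads the event data. buf should point to an array of inotify_event structures. Note, however, that the name member of inotify_event has a variable length; this is explained further below.
4. An existing watch can be removed with int ret = inotify_rm_watch(fd, wd);.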
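
The mask in step 2 is a plain bitwise OR of IN_* constants, and the same constants come back in event->mask when events are read. As a minimal sketch, assuming a Linux system with <sys/inotify.h>: the helper name describe_mask and the short list of events it knows about are illustrative only and are not part of the inotify API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/inotify.h>

/* Hypothetical helper (not part of inotify): writes the names of the
 * event bits set in mask into out, separated by '|'. Only a handful of
 * common events are listed. Returns out. */
static char *
describe_mask(uint32_t mask, char *out, size_t out_size)
{
    static const struct {
        uint32_t bit;
        const char *name;
    } table[] = {
        { IN_MODIFY,      "IN_MODIFY" },
        { IN_CREATE,      "IN_CREATE" },
        { IN_DELETE,      "IN_DELETE" },
        { IN_DELETE_SELF, "IN_DELETE_SELF" },
        { IN_MOVE_SELF,   "IN_MOVE_SELF" },
        { IN_IGNORED,     "IN_IGNORED" },
    };
    size_t i;

    if (out_size == 0)
        return out;
    out[0] = '\0';
    for (i = 0; i < sizeof(table) / sizeof(table[0]); ++i) {
        if (!(mask & table[i].bit))
            continue;
        /* strncat always terminates; the size argument is the room left */
        if (out[0] != '\0')
            strncat(out, "|", out_size - strlen(out) - 1);
        strncat(out, table[i].name, out_size - strlen(out) - 1);
    }
    return out;
}
```

For example, a watch limited to in-place changes and self-deletion would pass IN_MODIFY | IN_DELETE_SELF as the mask, and describe_mask on that value would yield the two names joined by '|'.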

Let's look at an example:

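#include <stdio.h>
#include <unistd.h>
#include <sys/select.h>
#include <errno.h>
#include <sys/inotify.h>

/* Print the mask of one event and, if present, the file name. */
static void
_inotify_event_handler(const struct inotify_event *event)
{
    printf("event->mask: 0x%08x\n", event->mask);
    /* name is only valid when len > 0; events on the watched object
     * itself carry no name */
    if (event->len > 0)
        printf("event->name: %s\n", event->name);
}

int
main(int argc, char **argv)
{
    if (argc != 2) {
        printf("Usage: %s <file/dir>\n", argv[0]);
        return -1;
    }

    /* aligned so that the cast to struct inotify_event * below is safe */
    char buf[1024]
        __attribute__((aligned(__alignof__(struct inotify_event)))) = {0};
    const struct inotify_event *event = NULL;

    int fd = inotify_init();
    if (fd < 0) {
        perror("inotify_init");
        return -1;
    }

    int wd = inotify_add_watch(fd, argv[1], IN_ALL_EVENTS);
    if (wd < 0) {
        perror("inotify_add_watch");
        close(fd);
        return -1;
    }

    for (;;) {
        fd_set fds;
        FD_ZERO(&fds);
        FD_SET(fd, &fds);
        if (select(fd + 1, &fds, NULL, NULL, NULL) > 0) {
            ssize_t len, index = 0;
            /* retry the read if it is interrupted by a signal */
            while (((len = read(fd, buf, sizeof(buf))) < 0) && (errno == EINTR));
            /* one read may return several variable-length events */
            while (index < len) {
                event = (const struct inotify_event *)(buf + index);
                _inotify_event_handler(event);
                index += sizeof(struct inotify_event) + event->len;
            }
        }
    }

    /* not reached: the loop above only ends when the process is killed */
    inotify_rm_watch(fd, wd);
    close(fd);

    return 0;
}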


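As the code above shows, the file descriptor returned by inotify_init can be multiplexed with select or poll. Because each inotify_event has a variable length, the offset of the next event has to be computed while walking the buffer (index += sizeof(struct inotify_event) + event->len). The len member is the length of the name member, counting its terminating null bytes and any padding.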
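
The offset arithmetic can be tried out without touching the file system by building a buffer of records by hand. The following is only a sketch under that assumption: put_event and count_events are hypothetical helper names, and the name lengths passed in are arbitrary stand-ins for what the kernel would report in event->len.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/inotify.h>

/* Hypothetical helper: appends one event record to buf at offset off and
 * returns the new offset. name_len must already include the terminating
 * NUL and any padding, just as the kernel reports it in event->len. */
static size_t
put_event(unsigned char *buf, size_t off, int wd, uint32_t mask,
          const char *name, uint32_t name_len)
{
    struct inotify_event ev;

    memset(&ev, 0, sizeof(ev));
    ev.wd = wd;
    ev.mask = mask;
    ev.len = name_len;
    memcpy(buf + off, &ev, sizeof(ev));
    if (name_len > 0) {
        memset(buf + off + sizeof(ev), 0, name_len);
        strncpy((char *)buf + off + sizeof(ev), name, name_len - 1);
    }
    return off + sizeof(ev) + name_len;
}

/* Walks len bytes of event records and returns how many were found. */
static int
count_events(const unsigned char *buf, size_t len)
{
    size_t index = 0;
    int count = 0;

    while (index + sizeof(struct inotify_event) <= len) {
        struct inotify_event ev;

        /* copy the fixed-size header out to avoid unaligned access */
        memcpy(&ev, buf + index, sizeof(ev));
        /* fixed header plus the variable-length name */
        index += sizeof(ev) + ev.len;
        ++count;
    }
    return count;
}
```

Appending three records with name lengths 16, 0 and 16 and then calling count_events on the result should report three events, even though the records have different sizes.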

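During actual testing, using the program above to watch a single file, I ran into two strange phenomena. First, editing the watched file with vim and saving the change triggered IN_DELETE_SELF and IN_MOVE_SELF rather than the expected IN_MODIFY. Second, modifying and saving again produced no events at all. I hope this serves as a lesson to readers. Both are consequences of how vim saves files: it does not write into the original file in place. With typical backup settings it renames the original to a backup, writes the buffer to a new file under the original name, and then deletes the backup (the .swp file is only vim's swap file, not the file that replaces the original). That explains the first phenomenon. The second follows from it: the original file has been replaced by a new one, so the file the watch was attached to no longer exists, and naturally no further events are generated.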
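
One common way around this, sketched here rather than taken from the original test, is to watch the containing directory and filter events by file name, so the watch survives the file being swapped out. The function names, the /tmp path and the file names are all made up for the demonstration, and the aligned attribute assumes GCC or Clang.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/inotify.h>

/* Returns 1 if the len bytes in buf contain an event for file `name`
 * whose mask has at least one of the bits in `mask` set, else 0. */
static int
has_event_for(const char *buf, ssize_t len, const char *name, uint32_t mask)
{
    ssize_t index = 0;

    while (index + (ssize_t)sizeof(struct inotify_event) <= len) {
        const struct inotify_event *ev =
            (const struct inotify_event *)(buf + index);

        if (ev->len > 0 && (ev->mask & mask) && strcmp(ev->name, name) == 0)
            return 1;
        index += sizeof(struct inotify_event) + ev->len;
    }
    return 0;
}

/* Simulates "write a new file, then rename it over the old one" inside a
 * fresh temporary directory while watching the directory itself.
 * Returns 1 if the replacement of "target" was reported, 0 if not,
 * -1 on setup errors. */
static int
replace_is_seen_via_parent(void)
{
    char dir[] = "/tmp/inotify-demo-XXXXXX";
    char target[64], incoming[64];
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    int result = -1;
    int fd, tfd;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(target, sizeof(target), "%s/target", dir);
    snprintf(incoming, sizeof(incoming), "%s/target.new", dir);

    tfd = open(target, O_CREAT | O_WRONLY, 0600);      /* the "old" file */
    if (tfd >= 0)
        close(tfd);

    fd = inotify_init();
    /* Watch the directory, not the file, so the watch survives the swap. */
    if (fd >= 0 && inotify_add_watch(fd, dir, IN_MOVED_TO | IN_CREATE) >= 0) {
        tfd = open(incoming, O_CREAT | O_WRONLY, 0600); /* the "new" file */
        if (tfd >= 0) {
            close(tfd);
            if (rename(incoming, target) == 0) {
                /* The events are already queued, so this does not block. */
                ssize_t len = read(fd, buf, sizeof(buf));
                result = has_event_for(buf, len, "target", IN_MOVED_TO);
            }
        }
    }

    if (fd >= 0)
        close(fd);
    unlink(incoming);
    unlink(target);
    rmdir(dir);
    return result;
}
```

The trade-off is that a directory watch reports events for every entry in the directory, so the name filter is required. The alternative of re-adding the file watch after IN_DELETE_SELF or IN_MOVE_SELF leaves a window in which changes can be missed.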

In addition, part four of the kernel documentation describes the background and design rationale of inotify; it is a must-read:

Q: What is the design decision behind not tying the watch to the open fd of
the watched object?

A: Watches are associated with an open inotify device, not an open file.
This solves the primary problem with dnotify: keeping the file open pins
the file and thus, worse, pins the mount. Dnotify is therefore infeasible
for use on a desktop system with removable media as the media cannot be
unmounted. Watching a file should not require that it be open.

Q: What is the design decision behind using an-fd-per-instance as opposed to
an fd-per-watch?

A: An fd-per-watch quickly consumes more file descriptors than are allowed,
more fd's than are feasible to manage, and more fd's than are optimally
select()-able. Yes, root can bump the per-process fd limit and yes, users
can use epoll, but requiring both is a silly and extraneous requirement.
A watch consumes less memory than an open file, separating the number
spaces is thus sensible. The current design is what user-space developers
want: Users initialize inotify, once, and add n watches, requiring but one
fd and no twiddling with fd limits. Initializing an inotify instance two
thousand times is silly. If we can implement user-space's preferences
cleanly--and we can, the idr layer makes stuff like this trivial--then we
should.

There are other good arguments. With a single fd, there is a single
item to block on, which is mapped to a single queue of events. The single
fd returns all watch events and also any potential out-of-band data. If
every fd was a separate watch,

- There would be no way to get event ordering. Events on file foo and
file bar would pop poll() on both fd's, but there would be no way to tell
which happened first. A single queue trivially gives you ordering. Such
ordering is crucial to existing applications such as Beagle. Imagine
"mv a b ; mv b a" events without ordering.

- We'd have to maintain n fd's and n internal queues with state,
versus just one. It is a lot messier in the kernel. A single, linear
queue is the data structure that makes sense.

- User-space developers prefer the current API. The Beagle guys, for
example, love it. Trust me, I asked. It is not a surprise: Who'd want
to manage and block on 1000 fd's via select?

- No way to get out of band data.

- 1024 is still too low. ;-)

When you talk about designing a file change notification system that
scales to 1000s of directories, juggling 1000s of fd's just does not seem
the right interface. It is too heavy.

Additionally, it _is_ possible to more than one instance and
juggle more than one queue and thus more than one associated fd. There
need not be a one-fd-per-process mapping; it is one-fd-per-queue and a
process can easily want more than one queue.

Q: Why the system call approach?

A: The poor user-space interface is the second biggest problem with dnotify.
Signals are a terrible, terrible interface for file notification. Or for
anything, for that matter. The ideal solution, from all perspectives, is a
file descriptor-based one that allows basic file I/O and poll/select.
Obtaining the fd and managing the watches could have been done either via a
device file or a family of new system calls. We decided to implement a
family of system calls because that is the preferred approach for new kernel
interfaces. The only real difference was whether we wanted to use open(2)
and ioctl(2) or a couple of new system calls. System calls beat ioctls.

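Sunday, June 10, 2007

RS-422 played a joke on me under Linux

I haven't updated this blog in a long time. It's like smoking: go without it long enough and you stop wanting it. That is probably a bad sign, and the long silence also shows that I haven't accumulated much technically during this period. Anyway, today's post records something that has been frustrating me lately.

Developing this project has been like the long-distance runs back in junior high school. For the first few laps you are full of energy and confidence. Then, as I imagine is true for most people, you slowly start to feel tired. Later the pressure keeps growing, your brain feels starved of oxygen and breathing gets hard, and by the end you are exhausted in body and mind. From birth, everyone carries responsibilities: you either live for those responsibilities or rise above them and live for your ideals, which is why some people live hard, joyless lives while others live lightly and with detachment. I digress; let me get to the technical problem I ran into a few days ago.

The customer has an external device that was purchased N years ago and has always been used in their old system, and this project needs to use it as well. Its documentation was lost long ago. The only available reference is the source code of a Windows dynamic-link library (which is also the most important reference; as Master Hou put it in one of his books, "in front of the source code, there are no secrets"). Based on that code I wrote a Windows test program, and it passed without a hitch. The test code is as follows: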

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int
main(int argc, char **argv)
{
    HANDLE h_com;
    unsigned char addr = 0x01;
    unsigned char buf[10];
    DWORD i;
    DCB dcb;
    COMMTIMEOUTS timeouts;
    DWORD n = 0;

    if (argc != 2) {
        printf("%s <comdev>\n", argv[0]);
        return -1;
    }

    h_com = CreateFile(argv[1], GENERIC_READ | GENERIC_WRITE,
                       0, NULL, OPEN_EXISTING, 0, NULL);
    if (h_com == INVALID_HANDLE_VALUE) {
        printf("Initialize COM device failed\n");
        return -1;
    }

    /* 9600 baud, 8 data bits, mark parity, 1 stop bit (8M1) */
    GetCommState(h_com, &dcb);
    dcb.BaudRate = 9600;
    dcb.ByteSize = 8;
    dcb.Parity = MARKPARITY;
    dcb.StopBits = ONESTOPBIT;
    SetCommState(h_com, &dcb);

    GetCommTimeouts(h_com, &timeouts);
    timeouts.ReadTotalTimeoutConstant = 10;
    SetCommTimeouts(h_com, &timeouts);

    /* send the device address with mark parity */
    WriteFile(h_com, &addr, sizeof(addr), &n, NULL);

    /* switch to space parity (8S1) and send the request */
    Sleep(1);
    dcb.Parity = SPACEPARITY;
    SetCommState(h_com, &dcb);
    memcpy(buf, "\x02\x85\x83\x04", 4);
    WriteFile(h_com, buf, 4, &n, NULL);

    /* read and dump the response */
    Sleep(2);
    n = 0;
    ReadFile(h_com, buf, sizeof(buf), &n, NULL);
    printf("read %lu bytes\n", (unsigned long)n);

    for (i = 0; i < n; ++i)
        printf("%02x ", buf[i]);
    printf("\n");

    CloseHandle(h_com);
    return 0;
}

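From this code, the communication goes roughly like this: configure the serial port as 8M1 and send the address, switch the port to 8S1 and send the request data, then receive the response. It looks simple enough. I then quickly wrote a test program for Linux; the source code is as follows: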

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <termios.h>

int
main(int argc, char **argv)
{
    if (argc != 2) {
        printf("Usage: %s <comdev>\n", argv[0]);
        return -1;
    }

    int fd = open(argv[1], O_RDWR | O_NOCTTY | O_NDELAY);
    if (fd == -1) {
        printf("Initialize COM device failed.\n");
        return -1;
    }

    struct termios options;
    if (tcgetattr(fd, &options) == -1) {
        close(fd);
        return -1;
    }

    cfsetispeed(&options, B9600);
    cfsetospeed(&options, B9600);

    /* intended: 8 data bits, mark parity, 1 stop bit (8M1) */
    options.c_cflag &= ~CSIZE;
    /* NOTE: PARENB is never set here, so unless it was already on,
     * parity stays disabled and CMSPAR | PARODD have no effect */
    options.c_cflag |= CS8 | CLOCAL | CREAD | CMSPAR | PARODD;
    /* NOTE: this keeps only the CSTOPB bit and clears every other
     * c_cflag bit set above (on Linux that includes the baud rate);
     * "&= ~CSTOPB" was presumably intended */
    options.c_cflag &= CSTOPB;
    options.c_iflag |= INPCK;
    options.c_iflag &= ~(IXON | IXOFF | IXANY);
    options.c_oflag &= ~OPOST;
    options.c_lflag &= ~(ECHO | ECHOE | ICANON | ISIG);
    tcflush(fd, TCIFLUSH);
    printf("tcsetattr result: %d\n", tcsetattr(fd, TCSANOW, &options));

    /* send the device address */
    char addr = 0x01;
    if (write(fd, &addr, 1) == -1)
        goto failed;

    usleep(1000);

    /* intended: switch from mark parity to space parity (8S1) */
    options.c_cflag &= ~PARODD;
    printf("tcsetattr result: %d\n", tcsetattr(fd, TCSADRAIN, &options));

    usleep(1000);

    /* send the request */
    unsigned char buf[10];
    memcpy(buf, "\x02\x85\x83\x04", 4);

    if (write(fd, buf, 4) == -1)
        goto failed;

    usleep(10000);

    /* read and dump the response */
    ssize_t read_n;
    int i;
    read_n = read(fd, buf, sizeof(buf));
    printf("result: %zd\n", read_n);
    for (i = 0; i < read_n; ++i)
        printf("%02X ", buf[i]);
    printf("\n");

    close(fd);
    return 0;

failed:
    close(fd);
    return -1;
}

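Although mark parity and space parity are not defined in POSIX, they can still be achieved on Linux through CMSPAR, which is described in the tcsetattr man page. Yet repeated attempts kept failing. Later, a colleague who knew nothing about Linux development asked me what each line of the serial port setup meant and then, working from the tcsetattr man page, tried combinations of the flags one by one, and it actually worked! The result was unexpected; see the following code: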

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <termios.h>

int
main(int argc, char **argv)
{
    if (argc != 2) {
        printf("Usage: %s <comdev>\n", argv[0]);
        return -1;
    }

    int fd = open(argv[1], O_RDWR | O_NOCTTY | O_NDELAY);
    if (fd == -1) {
        printf("Initialize COM device failed.\n");
        return -1;
    }

    struct termios options;

    /* start from a zeroed structure instead of the current settings */
    memset(&options, 0, sizeof(options));

    /* first phase: 8 data bits, parity off (no PARENB), 2 stop bits */
    options.c_cflag = CS8 | CLOCAL | CREAD | CSTOPB | CMSPAR;
    options.c_iflag = INPCK;
    options.c_oflag = 0;
    options.c_lflag = 0;

    /* set the speed after assigning c_cflag: on Linux the baud rate is
     * encoded in c_cflag, so a later plain assignment would wipe it */
    cfsetispeed(&options, B9600);
    cfsetospeed(&options, B9600);

    tcflush(fd, TCIFLUSH);
    printf("tcsetattr result: %d\n", tcsetattr(fd, TCSANOW, &options));

    /* send the device address */
    char addr = 0x01;
    if (write(fd, &addr, 1) == -1)
        goto failed;

    usleep(1000);

    /* second phase: enable parity with PARODD for the request bytes */
    options.c_cflag |= PARENB | PARODD;
    printf("tcsetattr result: %d\n", tcsetattr(fd, TCSADRAIN, &options));

    usleep(1000);

    /* send the request */
    unsigned char buf[10];
    memcpy(buf, "\x02\x85\x83\x04", 4);

    if (write(fd, buf, 4) == -1)
        goto failed;

    usleep(10000);

    /* read and dump the response */
    ssize_t read_n;
    int i;
    read_n = read(fd, buf, sizeof(buf));
    printf("result: %zd\n", read_n);
    for (i = 0; i < read_n; ++i)
        printf("%02X ", buf[i]);
    printf("\n");

    close(fd);
    return 0;

failed:
    close(fd);
    return -1;
}

Unexpectedly, leaving CMSPAR aside, 8M1 turns out to look like 8N2, and 8S1 looks like 8O2...
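
One possible reading of this observation, offered as a guess rather than something the tests above established: a stop bit sits at the same line level as a mark parity bit, so on the transmit side an 8N2 frame has the same waveform as an 8M1 frame. For the second half, assume the hardware in use simply ignored CMSPAR, so the request went out with ordinary odd parity. Each of the four request bytes (02 85 83 04) has an odd number of 1 bits, so odd parity appends a 0 to every one of them, which is exactly what space parity would send. The sketch below only checks that arithmetic; the function names are hypothetical.

```c
#include <stddef.h>

/* Returns the parity bit a UART appends to `byte` under odd parity:
 * 1 if the byte has an even number of 1 bits, 0 if it already has an
 * odd number. */
static int
odd_parity_bit(unsigned char byte)
{
    int ones = 0;

    while (byte) {
        ones += byte & 1;
        byte >>= 1;
    }
    return (ones % 2 == 0) ? 1 : 0;
}

/* Returns 1 if odd parity yields a 0 ("space") parity bit for every one
 * of the n bytes in data, 0 otherwise. */
static int
odd_parity_looks_like_space(const unsigned char *data, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i) {
        if (odd_parity_bit(data[i]) != 0)
            return 0;
    }
    return 1;
}
```

If this reading is right, the match is a coincidence of this particular request: a byte with an even number of 1 bits, such as 0x03, would get a parity bit of 1 under odd parity and would no longer look like space parity on the wire.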