From: 强 马 (vera_wx_cn_at_yahoo_dot_com.cn)
Date: Tue Nov 25 2008 - 23:50:46 PST
Hello,

BLCR is wonderful!

We have developed a checkpoint/restart system for MVAPICH programs based on BLCR. It runs on an x86 cluster and is being ported to IA64. I had to patch BLCR because it did not work on IA64.

Now I have a problem on IA64. Although my MVAPICH processes restart from the checkpoint files successfully, a segmentation fault always occurs some time after the restart. I examined the core file with gdb, but all the registers are zero, so no stack information can be obtained. I suspect a memory fault.

If I do not cancel the program after the checkpoints are finished and let it continue to run, it runs fine until it terminates normally. However, if I cancel the program once the checkpoints are finished and then restart it from the checkpoint files, I get the segmentation fault described above.

How can I resolve this problem? Can you help me or give me any tips? Thank you in advance.