Re: checkpointing across different cpu's

jerry.mersel_at_weizmann.ac.il
Date: Fri Apr 30 2010 - 00:45:03 PDT


Thank you for your response.
It is probably the last item you mentioned (the FPU state save/restore
instructions), since I turned off prelinking.
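
For anyone trying to narrow this down on their own nodes, here is a rough
sketch (not part of BLCR) that compares the CPU feature flags listed in
/proc/cpuinfo on the checkpoint host and the restart host. The function names
are made up for illustration, and whether any particular flag difference
(for example fxsr or xsave) actually matters to BLCR is an assumption I have
not confirmed.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found


def flag_diff(text_a, text_b):
    """Return (flags only on host A, flags only on host B), each sorted."""
    a, b = cpu_flags(text_a), cpu_flags(text_b)
    return sorted(a - b), sorted(b - a)
```

To use it, save the output of "cat /proc/cpuinfo" from each node to a file,
read both files, and pass their contents to flag_diff(). An empty result on
both sides only means the advertised flags match; it does not guarantee that
a restart will succeed.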

Thank you.


With Blessings
and Best regards,

    Jerry
      2363

You shall do no unrighteousness in judgment; you shall not favor the poor,
nor favor the mighty; but in righteousness you shall judge your neighbor.
(Torah portion, Kedoshim, Leviticus 19:15)

> Jerry,
>
> I don't know an exact or simple answer to your question, but I can list
> some things that I *know* would prevent migration between machines:
>
> + No migration between 32- and 64-bit CPUs even if the process was 32-bit.
> + No migration between kernels that are "too different", for which I
> have no good definition.
> + Will SEGV if pre-linking places shared libs at different addresses (we
> have a FAQ entry for this)
> + There are 2 or more different FPU state save/restore instructions
> available for the kernel to use, depending on the generation of CPU.  I
> don't know for certain, but I strongly suspect that state saved in the
> checkpoint by one such instruction would not restore with a different one.
>
> The last two items are my best guess since you indicate you are using
> the same kernel.
>
> Good luck and please let me know if you learn anything more.
> If we can collect more info, I will update the FAQ entry about migration.
>
> -Paul
>
>
> jerry.mersel_at_weizmann.ac.il wrote:
>> Hi:
>>
>> I've checkpointed/restarted jobs on different CPUs before; for example,
>> I've checkpointed on an AMD processor and restarted on a Xeon processor.
>>
>> It does not seem to work all the time, however. I just did a checkpoint
>> on a Xeon and tried to restart on an AMD, and I got a segmentation
>> fault while trying to restart the application.
>>
>> My question is: under what circumstances can I restart on a different
>> x64 CPU? How can I build my code so that I won't have problems with
>> this? (Or should it already be working?)
>>
>> I am using BLCR 0.8.0 on 2.6.9-55.ELsmp kernels.
>>
>
>
> --
> Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
> Future Technologies Group
> HPC Research Department                   Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>
>