Issue with daily re-connect & NextExpectedMsgSeqNum off by 1

Re: Issue with daily re-connect & NextExpectedMsgSeqNum off by 1

QuickFIX/J mailing list
QuickFIX/J Documentation: http://www.quickfixj.org/documentation/
QuickFIX/J Support: http://www.quickfixj.org/support/



Actually what you did in the code below is what I meant by "wait for the lock to be released".
To do it with a timeout you would need tryLock(long, TimeUnit), yes.

Chris.
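
For illustration, here is a minimal sketch of the timed variant mentioned above. The class and its method names are hypothetical stand-ins and are not taken from QuickFIX/J; the only real API used is java.util.concurrent's Lock#tryLock(long, TimeUnit). A return value of -1 is an assumed convention for "lock not obtained in time".

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a sender sequence number guarded by a lock. */
class TimedSeqNumReader {
    private final Lock senderMsgSeqNumLock = new ReentrantLock();
    private int nextSenderMsgSeqNum = 1;

    /** Writer path: increments under the lock, as a sending thread would. */
    void increment() {
        senderMsgSeqNumLock.lock();
        try {
            nextSenderMsgSeqNum++;
        } finally {
            senderMsgSeqNumLock.unlock();
        }
    }

    /**
     * Reader path: waits at most the given time for the lock.
     * Returns -1 if the lock was not obtained (timeout or interrupt).
     */
    int readWithTimeout(long time, TimeUnit unit) {
        boolean locked;
        try {
            locked = senderMsgSeqNumLock.tryLock(time, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return -1;
        }
        if (!locked) {
            return -1; // the caller decides what to do on timeout
        }
        try {
            return nextSenderMsgSeqNum;
        } finally {
            senderMsgSeqNumLock.unlock();
        }
    }
}
```

Unlike Lock#lock, the timed call cannot block the Logon path forever, but it forces the caller to handle the "no value" case explicitly.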


On 04/07/17 14:52, Freedman, Jon wrote:

I think the correct fix is to make sure that any change to SessionState is atomic. There appears to be an attempt to do this by surrounding Session#sendRaw with lockSenderMsgSeqNum.

Are you suggesting that, rather than changing Session#nextLogon to:

            state.lockSenderMsgSeqNum();
            final int actualNextNum = state.getMessageStore().getNextSenderMsgSeqNum();
            state.unlockSenderMsgSeqNum();

we refactor SessionState to use Lock#tryLock instead of Lock#lock and add time/unit parameters to SessionState#lockSenderMsgSeqNum?

See https://github.com/quickfix-j/quickfixj/compare/master...jonfreedman:master
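
As a side note on the snippet above, a lock taken around a store read is usually released in a finally block, so that an exception from the store cannot leave the lock held. The following is only a sketch under assumptions: SeqNumStore is a hypothetical one-method interface, not the QuickFIX/J MessageStore, and it is assumed that the read may throw IOException.

```java
import java.io.IOException;
import java.util.concurrent.locks.ReentrantLock;

/** Sketch: read the next sender sequence number while holding the lock. */
class LockedSeqNumRead {
    /** Hypothetical minimal store; assumed to be able to fail with IOException. */
    interface SeqNumStore {
        int getNextSenderMsgSeqNum() throws IOException;
    }

    private final ReentrantLock senderMsgSeqNumLock = new ReentrantLock();
    private final SeqNumStore store;

    LockedSeqNumRead(SeqNumStore store) {
        this.store = store;
    }

    /** Blocks until the lock is free, reads, and always unlocks. */
    int readNextSenderMsgSeqNum() throws IOException {
        senderMsgSeqNumLock.lock();
        try {
            return store.getNextSenderMsgSeqNum();
        } finally {
            senderMsgSeqNumLock.unlock(); // runs even if the store throws
        }
    }

    /** Exposed only so the example can show that the lock is released. */
    boolean isLocked() {
        return senderMsgSeqNumLock.isLocked();
    }
}
```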

 

From: Christoph John [[hidden email]]
Sent: 04 July 2017 12:32
To: Freedman, Jon; '[hidden email]'
Subject: Re: [Quickfixj-users] Issue with daily re-connect & NextExpectedMsgSeqNum off by 1

 

Hi Jon,

what about my suggestion from an earlier mail:

So one could simply wait for the lock to be released to read the correct value. Since this is only done rarely (only on Logon and only when tag 789 is used) it might be a feasible solution. But I'll think a while more about it.


Do you have a more elegant solution? ;) So far I don't have a more sophisticated one, and it is probably the only thing you can do.
But I'd rather not wait indefinitely for the lock to be released. The question is what happens when the lock cannot be obtained after a given time (1 second??). Abort the Logon, or simply obtain the sequence number without the lock, knowing that it might be wrong?

Chris.
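
The two options raised in this mail (abort the Logon, or read without the lock) can be sketched as an explicit policy. Everything here is hypothetical: the enum, the class, and the choice of an empty OptionalInt to mean "abort the Logon" are assumptions for illustration and not QuickFIX/J behaviour.

```java
import java.util.OptionalInt;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Sketch of what a caller might do when the timed lock attempt fails. */
class SeqNumTimeoutPolicy {
    enum OnTimeout { ABORT_LOGON, READ_UNLOCKED }

    final ReentrantLock lock = new ReentrantLock();
    final AtomicInteger nextSenderMsgSeqNum = new AtomicInteger(1);

    /** An empty result means "abort the Logon". */
    OptionalInt read(long timeoutMillis, OnTimeout policy) {
        boolean locked = false;
        try {
            locked = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // treat like a timeout
        }
        try {
            if (locked || policy == OnTimeout.READ_UNLOCKED) {
                // Without the lock this value may already be stale.
                return OptionalInt.of(nextSenderMsgSeqNum.get());
            }
            return OptionalInt.empty();
        } finally {
            if (locked) {
                lock.unlock();
            }
        }
    }
}
```

ABORT_LOGON trades availability for correctness, while READ_UNLOCKED keeps the Logon going at the risk of reintroducing the off-by-one value this thread is about.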


On 04/07/17 12:53, Freedman, Jon wrote:

The counterparty have confirmed they should have sent a gap fill but did not.  They’re looking into correcting that.

 

In the meantime is it possible to work on fixing the race condition?  I am happy to help with that.

 

From: Christoph John [[hidden email]]
Sent: 27 June 2017 11:08
To: Freedman, Jon; '[hidden email]'
Subject: Re: [Quickfixj-users] Issue with daily re-connect & NextExpectedMsgSeqNum off by 1

 

The Logon message that QFJ sent carried tag 789 (there is 789=1), which implicitly acts as a resend request.
The counterparty should now respond with either the resent messages or a SequenceReset message. I can see neither in the logs.

Chris.
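
To make the tag 789 mechanics concrete, here is a small sketch from the point of view of the side that receives such a Logon. It reflects my understanding of the FIX session rules for NextExpectedMsgSeqNum (789) and does not use the QuickFIX/J API; the class, the method names and the returned strings are all hypothetical.

```java
/** Sketch: decide how to react to NextExpectedMsgSeqNum (789) on a Logon. */
class NextExpectedSeqNumCheck {
    /** Extracts an integer tag from a SOH-delimited FIX string, or -1 if absent. */
    static int intTag(String fixMessage, int tag) {
        String prefix = tag + "=";
        for (String field : fixMessage.split("\u0001")) {
            if (field.startsWith(prefix)) {
                return Integer.parseInt(field.substring(prefix.length()));
            }
        }
        return -1;
    }

    /**
     * nextExpectedByPeer is the 789 value from the peer's Logon;
     * ourNextSenderSeqNum is the number we would assign to our next message.
     */
    static String actionFor(int nextExpectedByPeer, int ourNextSenderSeqNum) {
        if (nextExpectedByPeer < ourNextSenderSeqNum) {
            // Peer missed messages: resend them or cover them with a gap fill.
            return "RESEND " + nextExpectedByPeer + ".." + (ourNextSenderSeqNum - 1);
        }
        if (nextExpectedByPeer > ourNextSenderSeqNum) {
            // Peer expects more than we ever sent: not recoverable by resending.
            return "REJECT";
        }
        return "IN_SYNC";
    }
}
```

With 789=1 and a counterparty whose next outgoing number is 5, this sketch yields "RESEND 1..4", which corresponds to the resent messages or SequenceReset that were missing from the logs.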


--
Christoph John
Development & Support
Direct: +49 241 557080-28
Mailto:Christoph.John@...



http://www.macd.com


MACD GmbH
Oppenhoffallee 103
D-52066 Aachen
Tel: +49 241 557080-0 | Fax: +49 241 557080-10
 Amtsgericht Aachen: HRB 8151
Ust.-Id: DE 813021663

Geschäftsführer: George Macdonald


take care of the environment - print only if necessary

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Quickfixj-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quickfixj-users

Re: Issue with daily re-connect & NextExpectedMsgSeqNum off by 1

Freedman, Jon



I’ve created a pull request with this change at https://github.com/quickfix-j/quickfixj/pull/118. It would be great if we could get a version built with this included, and potentially enhance it further to make use of Lock#tryLock as a second step?

 

From: Christoph John [mailto:[hidden email]]
Sent: 04 July 2017 14:05
To: Freedman, Jon; '[hidden email]'
Subject: Re: [Quickfixj-users] Issue with daily re-connect & NextExpectedMsgSeqNum off by 1

 

Actually what you did in the code below is what I meant by "wait for the lock to be released".
To do it with a timeout you would need tryLock(long, TimeUnit), yes.

Chris.

This email, the information therein and any attached materials (collectively the "Email") are intended only for the addressee(s) and may contain confidential, proprietary, copyrighted and/or privileged material. If you have received this Email in error please delete it and notify the sender immediately. This Email remains the property of Brevan Howard, which reserves the right to require its return (together with any copies or extracts thereof) at any time upon request. Any unauthorised review, retransmission, dissemination, forwarding, printing, copying or other use of this Email is prohibited. Brevan Howard may be legally required to review and retain outgoing and incoming email and produce it to regulatory authorities and others with legal rights to the information. Internet communications cannot be guaranteed to be secure or error free as information could be intercepted, changed corrupted, lost, arrive late or contain viruses. Brevan Howard accepts no liability for any errors or omissions in this Email which arise as a result of internet transmission. This Email is not an official confirmation of any transaction. Any comments or statements made herein do not necessarily reflect the views of Brevan Howard. 
This Email is not an offer to sell or solicitation of an offer to buy any security or investment. It does not constitute or contain any investment advice and is being made without regard to the recipients investment objectives, financial situation or means. Past Performance is not an indicator of future results and Brevan Howard provides no assurance that future results will be consistent with any information provided herein or attached hereto. Brevan Howard and the sender make no warranties regarding the accuracy or completeness of the information in this Email and it should not be relied upon and is subject to change without notice. Brevan Howard and its representatives, officers and employees accept no responsibility for any losses suffered as a result of reliance on the information in this Email or the reliability, accuracy, or completeness thereof.
In this Email, "Brevan Howard" means Brevan Howard Asset Management LLP ("BHAM"), Brevan Howard Inc., Brevan Howard (Israel) Ltd and their respective affiliates. BHAM is a limited liability partnership authorised and regulated by the Financial Conduct Authority of the United Kingdom and registered in England & Wales (reg. no. OC302636).


Re: Issue with daily re-connect & NextExpectedMsgSeqNum off by 1

QuickFIX/J mailing list



Hi,

I think this can be done in 1.6.4. Feel free to add the tryLock in the same or another PR. We just need to think about what will happen if tryLock does not succeed after a predefined time. There is a Logon timeout that we probably should not exceed (IIRC 2 seconds).
But maybe I am just anxious and we can live without tryLock... ;)

Cheers,
Chris.
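
One way to keep a timed lock attempt inside the Logon timeout is to derive the wait from the time still remaining. This is a sketch only: the class and method names are hypothetical, and the timeout value is whatever the session is configured with (the 2 seconds above is quoted from memory and not verified here).

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Sketch: never wait for the lock longer than the Logon timeout allows. */
class LogonLockBudget {
    /** Returns how long we may still wait, in milliseconds, never negative. */
    static long remainingMillis(long logonStartedAtMillis, long logonTimeoutMillis, long nowMillis) {
        long elapsed = nowMillis - logonStartedAtMillis;
        return Math.max(0L, logonTimeoutMillis - elapsed);
    }

    /** Tries to take the lock within the budget; false on timeout or interrupt. */
    static boolean tryLockWithinBudget(ReentrantLock lock, long budgetMillis) {
        try {
            // A budget of 0 makes a single attempt without waiting.
            return lock.tryLock(budgetMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

If the budget is already used up, the attempt degenerates into a single non-blocking try, so the Logon timeout itself stays the upper bound.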


On 04/07/17 17:13, Freedman, Jon wrote:

I’ve created a pull request with this change @ https://github.com/quickfix-j/quickfixj/pull/118, it would be great if we could get a version built with this included and potentially enhance further to make use of Lock#tryLock as a second step?

 


Re: Issue with daily re-connect & NextExpectedMsgSeqNum off by 1

Freedman, Jon



1.6 PR: https://github.com/quickfix-j/quickfixj/pull/120
1.7 PR: https://github.com/quickfix-j/quickfixj/pull/118

Cheers

Jon

 

From: Christoph John [mailto:[hidden email]]
Sent: 07 July 2017 09:00
To: Freedman, Jon; '[hidden email]'
Subject: Re: [Quickfixj-users] Issue with daily re-connect & NextExpectedMsgSeqNum off by 1

 

Hi,

I think this can be done in 1.6.4. Feel free to add the tryLock in the same or another PR. We just need to think about what will happen if tryLock does not succeed after a predefined time. There is a Logon timeout that we probably should not exceed (IIRC 2 seconds).
But maybe I am just anxious and we can live without tryLock... ;)

Cheers,
Chris.
