Godror failing to handle enough concurrent connections. #192
Comments
What is the error (it has been omitted)? What if I don't know whether it is the pool's session acquiring/releasing or something else that causes the error. Are 20k concurrent connections really needed? |
A little background: So best is to have |
Hi, I didn't omit the error message; it just shows that with each failed request, repeatedly, when the concurrency limit is set to So for the scenario we will have around So I have tried all the configuration you have mentioned, like set the I really appreciate your feedback and the direction you are giving me, but at this point I'm a bit clueless. Does this have something to do with CGO? Also, can you try reproducing the scenario? I can give you the needed code for the server and client. Thank you! |
Sorry, but that's not an error message, just a stack trace. |
So, I have set the
Then, when I made around 20k requests, I got something like this repeatedly:
After that, it shows the stack trace like before. |
Doesn't it include other goroutine stacks or any additional error messages? |
@sudarshan12s Yes, it shows some other goroutine stacks at the end; let me post the other stack trace. I didn't set the Thanks! |
After showing the repeated stack traces like the ones above, it shows some other repeated stack traces like this:
|
Sometimes the stack trace is different, though; this time I got something like this:
Note that I'm now trying a different, simplified Go server that only inserts into one table. |
Is this 20k CONCURRENT connection, or 20k sequential? |
Here is the client I'm using to generate concurrent requests, although for the simplified server I'm not sending the JSON data. |
And here is the server and SQL table structure I'm trying server I have set the Thanks! |
Also, it doesn't show these in the output |
request_server.mp4
Here is a partial screen capture of the server log. |
Why can't you just share that log somewhere? First, we have to find out where are those resources (dpiStmt, dpiConn) released, or why they aren't. A parallel experiment: can you run 20k requests sequentially? How many connections do they use? |
A few observations: If you set MinSessions to 20000, DB.Ping() potentially creates 20k connections, so in the logs this dpiPool_create would potentially create 20k connections. In your server program I see 1000, and the increment is set to 1. If 20k goroutines request connections concurrently, that increment is small; you can make it a little larger, or increase MinSessions. The MinSessions connections are created even before your web server is up, so subsequent goroutine requests should simply be served with connections already in the pool. You mentioned that the maximum number of sessions on the server is 6k; this also needs to be increased, because the number of concurrent requests is higher. From the traces, all goroutines create connections concurrently and each has been assigned a separate OS thread, which might be OK. If MinSessions is high, fewer goroutines will need to create new connections. |
As I see in the video, DPI debug messages intermingle with Go error/stack messages. |
I was trying around 20k requests mostly because that's where it fails, and that log is so big that I can't trace it all. With a small number of requests the problem does not occur, but let me share the log for one, two, or a few requests. Sequentially? I haven't tried that many requests that way, but I will. |
This is what I get with one request with DPI_DEBUG_LEVEL=92, nothing else:
The same result repeats for two requests. |
Even with 200 or 1000 requests it shows the log as above, and they all get a success response. |
@sudarshan12s I actually tried MinSessions with 20000; the 1000 in the example file was put there by mistake. I have asked our DBA to increase the maximum sessions to at least 30k. Would removing the Ping() make any improvement? |
So, if I remove the db.Ping() part, I get an error like this even with 1000 requests, but after adding it back, 1000 requests work fine, which is a bit strange to me.
|
Keep the db.Ping(); it causes the pool to be created from a single thread. If multiple goroutines try to create a pool at the same time, it can cause the error above. So calling db.Ping() before the web server starts looks good. |
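The idea of building the pool once, from a single goroutine, before serving traffic can be illustrated without a database. This is a minimal sketch under stated assumptions: `fakePool`, `getPool`, and `createCount` are hypothetical names, and `sync.Once` is simply one general Go way to guarantee one-time initialization. It does not describe what godror does internally.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fakePool is a hypothetical stand-in for an expensive shared resource
// such as a session pool. It never talks to a database.
type fakePool struct{ id int64 }

var (
	poolOnce sync.Once
	pool     *fakePool
	creates  int64 // how many times a pool was actually built
)

// getPool returns the shared pool and builds it on the first call only,
// no matter how many goroutines call it at the same time.
func getPool() *fakePool {
	poolOnce.Do(func() {
		pool = &fakePool{id: atomic.AddInt64(&creates, 1)}
	})
	return pool
}

// createCount reports how many pools have been built so far.
func createCount() int64 { return atomic.LoadInt64(&creates) }

func main() {
	// Warm up on the main goroutine, in the same spirit as calling
	// db.Ping() before the web server starts.
	getPool()

	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = getPool()
		}()
	}
	wg.Wait()
	fmt.Println("pools created:", createCount())
}
```

With the warm-up call in place, the 1000 goroutines only ever read an already-built value, which is the situation the advice above aims for.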
PTAL Tweak it for your environment (either use it with For my free Oracle ADB instance, it has inserted 8192 rows in 21s, using 16 sessions - concurrency was 8192. One difference is that I have AFAIK with 0 it won't wait for a connection to be available. |
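The run described above had 8192 concurrent requests sharing 16 sessions. One general way to get that shape in Go is to cap the number of in-flight tasks with a buffered channel used as a semaphore. The sketch below is only an illustration: `runBounded` is a hypothetical helper, the task does no database work, and this is not necessarily how the linked test limits concurrency.

```go
package main

import (
	"fmt"
	"sync"
)

// runBounded starts n goroutines but lets at most limit of them run
// task at the same time. It returns the highest number of tasks that
// were observed running simultaneously.
func runBounded(n, limit int, task func(i int)) int {
	sem := make(chan struct{}, limit) // one slot per allowed in-flight task
	var (
		mu      sync.Mutex
		running int
		peak    int
		wg      sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot, blocking if all are taken
			defer func() { <-sem }() // release the slot when done

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			task(i)

			mu.Lock()
			running--
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return peak
}

func main() {
	var mu sync.Mutex
	done := 0
	peak := runBounded(8192, 16, func(int) {
		mu.Lock()
		done++
		mu.Unlock()
	})
	fmt.Println("tasks done:", done, "peak in flight:", peak)
}
```

All 8192 goroutines exist at once, but no more than 16 are ever inside the task, so a 16-session pool would never be asked for a 17th session by this code.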
Okay, I'll have to dig down to it. I'll come back to you soon. |
I can reproduce something similar when creating a separate connection pool to the same DB:
I can toggle it by leaving See line 40 in |
Are you just running it from the test, or trying it with client code that uses a similar approach? I ran the test and it passed, but I did not see any concurrency number:
It's on my local machine, though; I'm going to try it on the VM now. I will also try those parameters accordingly and see what happens. |
env GODROR_TEST_DSN="..." go test -run=Concurrency -v
env GODROR_CONC_NO_LIMIT=1 GODROR_TEST_DSN="..." go test -run=Concurrency
(Email reply, quoting Monir Zaman's comment sent Wednesday, October 20, 2021, 6:23:54 PM, including its test output:)
--- PASS: TestConcurrency (42.03s)
PASS
ok github.com/godror/godror 42.146s
|
So running the test with this command
What does that indicate? |
Strange. With GODROR_CONC_NO_LIMIT=1 I get a SIGSEGV; without it the test passes. Either way, try setting db.SetMaxOpenConns to something not higher than poolMaxSessions. |
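The effect of capping `database/sql` itself can be observed without Oracle. The sketch below assumes a hypothetical fake driver (`countingConnector`, `fakeDriver`, `fakeConn`) that only counts open connections; it is not godror and runs no SQL. It shows that with `SetMaxOpenConns(n)`, the driver is never asked for more than n connections at once, however many goroutines request one.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync"
	"time"
)

// countingConnector is a hypothetical stand-in for a real driver connector.
// It hands out fake connections and records the peak number open at once.
type countingConnector struct {
	mu   sync.Mutex
	open int
	peak int
}

func (c *countingConnector) Connect(context.Context) (driver.Conn, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.open++
	if c.open > c.peak {
		c.peak = c.open
	}
	return &fakeConn{parent: c}, nil
}

func (c *countingConnector) Driver() driver.Driver { return fakeDriver{} }

type fakeDriver struct{}

func (fakeDriver) Open(string) (driver.Conn, error) {
	return nil, errors.New("use sql.OpenDB with the connector")
}

// fakeConn supports nothing except being opened and closed.
type fakeConn struct{ parent *countingConnector }

func (fc *fakeConn) Prepare(string) (driver.Stmt, error) { return nil, errors.New("not supported") }
func (fc *fakeConn) Begin() (driver.Tx, error)           { return nil, errors.New("not supported") }
func (fc *fakeConn) Close() error {
	fc.parent.mu.Lock()
	defer fc.parent.mu.Unlock()
	fc.parent.open--
	return nil
}

// peakConnections fires requests goroutines at a DB capped at maxOpen
// connections and returns the peak number of driver connections seen.
func peakConnections(requests, maxOpen int) int {
	connector := &countingConnector{}
	db := sql.OpenDB(connector)
	db.SetMaxOpenConns(maxOpen)
	db.SetMaxIdleConns(maxOpen)

	var wg sync.WaitGroup
	for i := 0; i < requests; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			conn, err := db.Conn(context.Background()) // waits if the cap is reached
			if err != nil {
				return
			}
			time.Sleep(time.Millisecond) // pretend to run a statement
			conn.Close()                 // hand the connection back to database/sql
		}()
	}
	wg.Wait()
	db.Close()

	connector.mu.Lock()
	defer connector.mu.Unlock()
	return connector.peak
}

func main() {
	fmt.Println("peak driver connections:", peakConnections(20000, 100))
}
```

The reported peak stays at or below 100 because the extra goroutines wait inside `database/sql` instead of reaching the driver, which is the reason for keeping this cap no higher than the driver pool's maximum.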
Okay, so with MaxSessions and MinSessions set to 100, I have been able to get pretty close to 19k concurrent requests. There was no server-side crash, though I got some "connection reset by peer" errors on the client side. That's a different problem, though.
P.PoolParams.SessionIncrement = 30
P.PoolParams.MaxSessions = 100
P.PoolParams.MinSessions = 100
P.PoolParams.WaitTimeout = time.Second * 5
// fmt.Println(P.StringWithPassword())
log.Println(P)
DB = sql.OpenDB(godror.NewConnector(P))
if err := DB.Ping(); err != nil {
log.Println(err)
log.Fatal(err, " dsn: ")
}
DB.SetMaxIdleConns(P.PoolParams.MinSessions)
DB.SetMaxOpenConns(P.PoolParams.MaxSessions) |
Although for high concurrency, I'd set it around the average (minimum?) number of concurrent requests. By tuning it, you may gain on latency, but sessions are released back to Oracle less often. |
Adding some details on the parameters. When DB.SetMaxOpenConns is higher than the current number of connections in the pool, goroutine requests wait until the pool has been expanded, in steps of P.PoolParams.SessionIncrement and limited by P.PoolParams.MaxSessions. To avoid expanding the pool after it has been created, we can provide a higher P.PoolParams.MinSessions, but this is also the pool size retained at runtime.
poolWaitTimeout: the time to wait for a new connection from the pool. Default is 30s. It can cause ORA-24459 if all goroutines have taken up the connections and are busy while the pool is expanding for a new goroutine request.
poolSessionTimeout: used to terminate idle connections. Default is 5 minutes. |
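The wait-timeout behaviour described above can be pictured with a toy pool. This is a sketch under stated assumptions: `sessionPool`, `acquire`, `release`, and `errPoolBusy` are hypothetical names, the pool has a fixed size and never expands, and the error is not an Oracle or godror error value. It only shows the general "wait up to a limit, then give up" pattern.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errPoolBusy is a hypothetical error for this sketch only.
var errPoolBusy = errors.New("no session became free before the wait timeout")

// sessionPool is a toy fixed-size pool: every token in the channel
// represents one free session.
type sessionPool struct{ free chan struct{} }

func newSessionPool(size int) *sessionPool {
	p := &sessionPool{free: make(chan struct{}, size)}
	for i := 0; i < size; i++ {
		p.free <- struct{}{}
	}
	return p
}

// acquire waits up to wait for a free session and then gives up.
func (p *sessionPool) acquire(wait time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), wait)
	defer cancel()
	select {
	case <-p.free:
		return nil
	case <-ctx.Done():
		return errPoolBusy
	}
}

// release puts a session back so that a waiting caller can take it.
func (p *sessionPool) release() { p.free <- struct{}{} }

func main() {
	p := newSessionPool(1)
	fmt.Println(p.acquire(50 * time.Millisecond)) // a session is free
	fmt.Println(p.acquire(50 * time.Millisecond)) // pool exhausted: times out
	p.release()
	fmt.Println(p.acquire(50 * time.Millisecond)) // free again after release
}
```

When every session is held by a busy goroutine for longer than the wait limit, callers start failing with the timeout error, which is the general situation the ORA-24459 remark above refers to.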
I think you are not seeing a SIGSEGV. Still, if you can upload the entire logs somewhere like https://gist.github.com/, it would help identify the reason for the termination.
Hi,
We have a server that can handle 60,000+ concurrent requests, but that path doesn't hit the database. With an API that accepts JSON data and inserts it into the database, it fails at 20,000 concurrent requests, although it works fine with 10k requests. I have also tried not calling the DB insert function from that API, and then it works fine.
Now, with around 20k requests, it throws an error like this; here is the stack trace:
Following one of your suggestions that I found, I tried context.WithTimeout(ctx, 1*time.Minute) and db.ExecContext() for executing the insert query. The result is the same.
Here is my config for the godror I'm using:
If you want to know a little bit of server config:
I'm currently trying with the client I have written in Golang, which has the http.Transport configuration like this:
I can show you the full client code if you want, but it successfully handles 10k+ requests. I also tried tsung to generate the load, and the result is pretty much the same. Our database is configured for 400k connections, but the session limit is about 6k (do we need to increase it?). There is no error on the database side, though.
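For readers building a similar Go load client, the following sketch shows which `http.Transport` fields usually matter for many requests to a single host. Every value is an illustrative assumption and `newLoadTransport` is a hypothetical helper; this is not the configuration used in this issue, which is not shown in the text above.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newLoadTransport returns a transport tuned for many requests to one host.
// All values are illustrative assumptions, not the settings from this issue.
func newLoadTransport(connsPerHost int) *http.Transport {
	return &http.Transport{
		MaxIdleConns: connsPerHost,
		// The default of 2 idle connections per host leads to frequent
		// reconnects when many requests target the same server.
		MaxIdleConnsPerHost: connsPerHost,
		// Caps dialing, active, and idle connections per host; 0 means no limit.
		MaxConnsPerHost: connsPerHost,
		IdleConnTimeout: 90 * time.Second,
	}
}

func main() {
	client := &http.Client{
		Transport: newLoadTransport(1000),
		Timeout:   time.Minute,
	}
	fmt.Println("client timeout:", client.Timeout)
}
```

Capping connections per host on the client side also bounds how many sockets the server has to accept at once, which can help separate client-side resets from database-side failures.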
As for the house data I'm sending: it could be an array of houses, but right now I'm sending only one house, which has around 34 fields, mostly 1- or 2-digit integer values except for the uid field. The insert query inserts into two tables, one of which has very few fields.
I have tried several ways to tweak the configuration and couldn't get it working. Our target is to reach at least 60k concurrent requests. Is this a bug, or what might I be doing wrong here? Can you give me any direction? Any help is appreciated!
Thank you!