-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
T864 - Separate log file for unparsed queries #295
Conversation
} | ||
|
||
// NewAcraCensor creates new censor object. | ||
func NewAcraCensor() *AcraCensor { | ||
acraCensor := &AcraCensor{} | ||
acraCensor.logger = log.WithField("service", ServiceName) | ||
acraCensor.ignoreParseError = false | ||
acraCensor.parseErrorsLogPath = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need such initialization because it's already initialized with empty string. Above assigning is redundant too
if err != nil { | ||
return err | ||
} | ||
defer openedFile.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we doesn't use this file after creation we can close it here without defer. but better to store file descriptor as censor field to avoid opening every write operation and reduce count of syscalls (open/close)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When and where should I close it is such case when acraserver stops?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just here) openedFile.Close()
instead defer openedFile.Close()
because this file doesn't used anywhere after creation. As I understand here we just create file and check that we can do this and close it before next write.
But if we will store opened file in censor then we don't need .Close()
here even with defer
if acraCensor.parseErrorsLogPath != "" { | ||
err := acraCensor.saveQuery(rawQuery) | ||
if err != nil { | ||
acraCensor.logger.WithError(err).Errorf("An error occurred while saving unparsable query") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use Errorln
to append '\n' at end of log msg. Errorf
useful to create msg with some format of string
err := acraCensor.saveQuery(rawQuery) | ||
if err != nil { | ||
acraCensor.logger.WithError(err).Errorf("An error occurred while saving unparsable query") | ||
return err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if censor configured with acracensor.ignoreParseError
then we should not stop processing on error related with unparsed query and enough log error and try to save without stopping processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vixentael what you think about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree, if acracensor.ignoreParseError
is true, then log error and move forward
defer openedFile.Close() | ||
|
||
_, err = openedFile.WriteString(rawQuery + "\n") | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return err
enough imho :)
plus better to use opened file on loading configuration and use it to avoid open/close on every save
configs/acra-censor.example.yaml
Outdated
@@ -1,4 +1,5 @@ | |||
ignore_parse_error: false | |||
parse_errors_log: unparsed_queries_log |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about uparsed_queries.log
name? to add some common extension to avoid meaning some binary file without extension on linux
|
||
defer openedFile.Close() | ||
|
||
_, err = openedFile.WriteString(rawQuery + "\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we will have same problem which we had with querycapture handler where we try to buffer data before dumping to file to avoid slow write on each query. here we will have same problem with situation when client will use not supported queries. plus we don't need repeated queries, only unique. maybe would be good to write some QueryWriter
(and use in querycapture if we can to avoid duplicated code) which will do this logic with buffered write and timeouts and store unique queries in memory to avoid repeated queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I will write additional handler for this (with functionality like QueryCapture)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree, it makes sense to allocate same functionality into some object lke QueryWriter
and use it for both QueryCapture
and here
} | ||
|
||
// NewAcraCensor creates new censor object. | ||
func NewAcraCensor() *AcraCensor { | ||
acraCensor := &AcraCensor{} | ||
acraCensor.logger = log.WithField("service", ServiceName) | ||
acraCensor.ignoreParseError = false | ||
acraCensor.parsedQueriesWriter = nil | ||
acraCensor.unparsedQueriesWriter = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can skip it because its already nil
acra-censor/common/logging_logic.go
Outdated
// read existing queries from file | ||
err = handler.readAllQueriesFromFile() | ||
if err != nil { | ||
handler.logger.WithError(ErrCantReadQueriesFromFileError).WithField(logging.FieldKeyEventCode, logging.EventCodeErrorCensorIOError).Errorln("Can't create QueryWriter instance") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should log original error err
using WithError
acra-censor/common/logging_logic.go
Outdated
err = handler.readAllQueriesFromFile() | ||
if err != nil { | ||
handler.logger.WithError(ErrCantReadQueriesFromFileError).WithField(logging.FieldKeyEventCode, logging.EventCodeErrorCensorIOError).Errorln("Can't create QueryWriter instance") | ||
openedFile.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should process error to by logging it
acra-censor/common/logging_logic.go
Outdated
case <-handler.serializationTicker.C: | ||
err := handler.dumpBufferedQueriesToFile(openedFile) | ||
if err != nil { | ||
handler.logger.WithError(ErrComplexSerializationError).WithField(logging.FieldKeyEventCode, logging.EventCodeErrorCensorIOError).Errorln("Can't create QueryWriter instance") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log original error
acra-censor/common/logging_logic.go
Outdated
case <-signalBackgroundExit: | ||
err := handler.finishAndCloseFile(openedFile) | ||
if err != nil { | ||
handler.logger.WithError(ErrComplexSerializationError).WithField(logging.FieldKeyEventCode, logging.EventCodeErrorCensorIOError).Errorln("Can't create QueryWriter instance") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
acra-censor/common/logging_logic.go
Outdated
// DumpAllQueriesToFile writes stored queries into file. | ||
func (queryWriter *QueryWriter) DumpAllQueriesToFile() error { | ||
// open or create file, NO APPEND | ||
f, err := os.OpenFile(queryWriter.filePath, os.O_TRUNC|os.O_RDWR|os.O_CREATE, 0600) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to close file in function which open it with defer
acra-censor/common/logging_logic.go
Outdated
return nil | ||
} | ||
|
||
func (queryWriter *QueryWriter) appendQueries(queries []*QueryInfo, openedFile *os.File) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because we don't use any os.File specific API then we can take io.Writer as param.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use openedFile.Write(lines)
in this function
acra-censor/common/logging_logic.go
Outdated
return linesToAppend | ||
} | ||
|
||
func (queryWriter *QueryWriter) finishAndCloseFile(openedFile *os.File) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://golang.org/pkg/io/#Closer enough here. interfaces better to use instead specific types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please, explain in more details
acra-censor/common/logging_logic.go
Outdated
signalBackgroundExit := make(chan bool) | ||
|
||
// create handler | ||
handler := &QueryWriter{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should assign openedFile
to the related writer which use it to not forget to close it on Release/Close. resource owner responsible for releasing resources. plus it allow to avoid extra OpenFile calls because file already opened
acra-censor/common/logging_logic.go
Outdated
handler.serializationTicker = time.NewTicker(handler.serializationTimeout) | ||
|
||
case <-signalBackgroundExit: | ||
err := handler.finishAndCloseFile(openedFile) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imho better to call handler.Release
here and rename finishAndCloseFile
to Release
, which should dump data and close file
@@ -110,3 +103,39 @@ func (acraCensor *AcraCensor) HandleQuery(rawQuery string) error { | |||
acraCensor.logger.Infof("Allowed query: '%s'", queryWithHiddenValues) | |||
return nil | |||
} | |||
|
|||
// GetLoggingTimeout returns current timeout of censor's logging process | |||
func (acraCensor *AcraCensor) GetLoggingTimeout() time.Duration { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we doesn't need such methods because they will not called anywhere and logging timeout we set once at first censor configurations with some constant and it will not changed during acra-server run. so I think we don't need extend AcraCensorInterface. anyway it's specific settings for handler, not censor at all.
} | ||
|
||
// SetLoggingTimeout sets timeout of censor's logging process | ||
func (acraCensor *AcraCensor) SetLoggingTimeout(duration time.Duration) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
save as above
|
||
func saveQuery(writer *common.QueryWriter, query string) { | ||
//skip already captured queries | ||
for _, capturedQuery := range writer.Queries { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is logic of QueryWriter, not censor. better to move this logic to query writer and add some method like handler.WriteQuery(query string)
where it will check is this query exists or not and create new QueryInfo and append or not
after moving we will not need writer.Queries as public field and change to private
@@ -36,4 +39,6 @@ type AcraCensorInterface interface { | |||
AddHandler(handler QueryHandlerInterface) | |||
RemoveHandler(handler QueryHandlerInterface) | |||
ReleaseAll() | |||
GetLoggingTimeout() time.Duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imho we don't need it
// WriteAll writes raw queries that need to be stored to internal file | ||
func (storage *FileLogStorage) WriteAll(p []byte) error { | ||
if storage.openedToAppend { | ||
openedFile, err := os.OpenFile(storage.filePath, os.O_TRUNC|os.O_RDWR|os.O_CREATE, 0600) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't use already opened file in our constructor. we didn't close it and didn't reuse. I prefer to reuse:
if err := storage.file.Truncate(0); err != nil {return err} // drop all data but leave file already opened and ready to write
|
||
// DumpQueries writes all queries into file. | ||
func (queryWriter *QueryWriter) DumpQueries() error { | ||
rawData := queryWriter.serializeQueries(queryWriter.Queries) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we need lock to avoid background append to Queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after serialization, we can unlock and avoid lock on writing to file because we have serialized copy of data
acra-censor/common/logging_logic.go
Outdated
// GetAllInputQueries returns a list of non-masked RawQueries. | ||
func (queryWriter *QueryWriter) GetAllInputQueries() []string { | ||
var queries []string | ||
for _, queryInfo := range queryWriter.Queries { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we need Rlock. we read shareable resource that may be changed during read. and go runtime will panic on loop through map that was changed at this time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please explain what is Rlock? I googled but only usual Mutex.lock found
acra-censor/common/logging_logic.go
Outdated
// It will be written to file on Stop, Reset or Release. | ||
// Expects redacted query | ||
func (queryWriter *QueryWriter) MarkQueryAsForbidden(queryWithHiddenValues string) { | ||
for index, queryInfo := range queryWriter.Queries { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too RLock
queryInfo := &QueryInfo{} | ||
queryInfo.RawQuery = queryWithHiddenValues | ||
queryInfo.IsForbidden = false | ||
queryWriter.Queries = append(queryWriter.Queries, queryInfo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lock on write
|
||
func (acraCensor *AcraCensor) saveUnparsedQuery(query string) { | ||
if acraCensor.unparsedQueriesWriter != nil { | ||
saveQuery(acraCensor.unparsedQueriesWriter, query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imho better to use unparsedQueriesWriter.WriteQuery(query)
or move saveQuery
to file with this handler
@@ -69,7 +69,7 @@ func NewFileQueryWriter(filePath string) (*QueryWriter, error) { | |||
// read existing queries | |||
err = writer.readStoredQueries() | |||
if err != nil { | |||
writer.logger.WithError(ErrCantReadQueriesFromFileError).WithField(logging.FieldKeyEventCode, logging.EventCodeErrorCensorIOError).Errorln("Can't create QueryWriter instance") | |||
writer.logger.WithError(ErrCantReadQueriesFromStorageError).WithField(logging.FieldKeyEventCode, logging.EventCodeErrorCensorIOError).Errorln("Can't create QueryWriter instance") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should log origin error too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be logged from readStoredQueries function
|
||
// DumpQueries writes all queries into file. | ||
func (queryWriter *QueryWriter) DumpQueries() error { | ||
rawData := queryWriter.serializeQueries(queryWriter.Queries) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after serialization, we can unlock and avoid lock on writing to file because we have serialized copy of data
acra-censor/common/logging_logic.go
Outdated
@@ -134,13 +127,13 @@ func (queryWriter *QueryWriter) Release() { | |||
} | |||
|
|||
func (queryWriter *QueryWriter) reset() { | |||
queryWriter.mutex.Lock() | |||
defer queryWriter.mutex.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer
good to use if we have a lot of points of exit to avoid multiple calls of Unlock
. For example
func t(){
l.Lock()
_, err := func1()
if err != nil{
l.Unlock()
return
}
_, err := func2()
if err != nil{
l.Unlock()
return
}
_, err := func3()
if err != nil{
l.Unlock()
return
}
}
In such case to not forget unlock and call it once we should use defer l.Unlock
and it simple
But defer
has a small overhead. It's really small but it have. And when we have simple cases such this we can avoid it without any care that we can forget
} | ||
storage.file = openedFile | ||
storage.openedToAppend = false | ||
if err := storage.file.Truncate(0); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks better)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I just didn't know that Truncate method exists
@@ -7,7 +7,6 @@ import ( | |||
|
|||
// FileLogStorage is a file-based implementation of LogStorage interface | |||
type FileLogStorage struct { | |||
filePath string | |||
file *os.File | |||
openedToAppend bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need it more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no it seems we didn't. I used it when reopening file with other modes set, but now Truncate allows to avoid it
|
||
// Release dumps all Captured queries to file and resets Captured queries list. | ||
func (queryWriter *QueryWriter) Release() { | ||
queryWriter.DumpQueries() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should handle error and log about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DumpQueries logs errors inside
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then would be better to assign error to anonym variable to be compliant with linters (goland in last versions highlight such code with unhandled error
message :) plus it's good to log twice. with one more log we add some context where this function was called (DumpQueries is public function that may be called in a lot of places in a future who will use this writer). Here we will have error on Release
, not on serialization by timeout
t.Fatal(err) | ||
} | ||
} | ||
time.Sleep(DefaultSerializationTimeout + 100*time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in previous PR's we decreased time of tests twice just by decreasing granulation of timeouts. and there we can't it handle more deterministic because we can't control environment fully
but here in unit-tests we can. what about to move background goroutine in our constructor NewFileQueryWriter
to some function like start()
and rewrite like something:
func (writer *QueryWriter) start(){
for {
select {
case <-writer.serializationTicker.C:
err := writer.dumpBufferedQueries()
if err != nil {
writer.logger.WithError(err).WithField(logging.FieldKeyEventCode, logging.EventCodeErrorCensorIOError).Errorln("Can't dump buffered queries")
}
writer.serializationTicker.Stop()
writer.serializationTicker = time.NewTicker(writer.serializationTimeout)
case <-signalBackgroundExit:
writer.serializationTicker.Stop()
err := writer.logStorage.Close()
if err != nil {
writer.logger.WithError(err).WithField(logging.FieldKeyEventCode, logging.EventCodeErrorCensorIOError).Errorln("Error occurred on exit QueryWriter instance")
}
case <-signalShutdown:
writer.serializationTicker.Stop()
err := writer.logStorage.Close()
if err != nil {
writer.logger.WithError(err).WithField(logging.FieldKeyEventCode, logging.EventCodeErrorCensorIOError).Errorln("Error occurred on shutdown QueryWriter instance")
}
return
default:
//do nothing. This means that channel has no data to read yet
}
}
}
// NewFileQueryWriter creates QueryWriter instance
func NewFileQueryWriter(filePath string) (*QueryWriter, error) {
writer := &QueryWriter{}
// initialization logic
go writer.Start()
return writer, nil
}
after that we can in our tests do:
writer, err := NewFileQueryWriter(tmpFile.Name())
for _, query := range testQueries {
_, err = writer.RedactAndCheckQuery(query)
if err != nil {
t.Fatal(err)
}
}
writer.Release() // or writer.DumpQueries() or any other private method
go writer.start() // here we restart our background goroutine
plus we need this background goroutine only to test logic with background serialization and stop. in other tests (dumping/serialization/deserialization) we can test without it, just call direct methods with writer.dumpBlaBla
, writer.deserializeBlaBla
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix last comments (tests should not pass gofmt check) and ok to merge
acra-censor/common/logging_logic.go
Outdated
@@ -52,26 +52,24 @@ type QueryWriter struct { | |||
// NewFileQueryWriter creates QueryWriter instance | |||
func NewFileQueryWriter(filePath string) (*QueryWriter, error) { | |||
// create writer | |||
writer := &QueryWriter{} | |||
writer := &QueryWriter{ | |||
queryIndex:0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not sure, but 90% that this file not formatted with gofmt )
} | ||
return nil | ||
}) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return err
: )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's output value of internal function that passed as argument to WalkQueries
No description provided.