History | Log In     View a printable version of the current page. Get help!  
Issue Details [XML]

Key: CHD-146
Type: Improvement Improvement
Status: Open Open
Assignee: Unassigned
Reporter: Jeff Standen [WGM]
Votes: 7
Watchers: 4
Operations

Clone this issue
If you were logged in you would be able to see more operations.
Cerberus Helpdesk

Feature to purge contact addresses that have no tickets and aren't associated with a company

Created: 21/Aug/07 06:37 PM   Updated: 26/Jan/11 11:19 PM
Fix Version/s: Catch and Release

Original Estimate: Unknown Remaining Estimate: Unknown Time Spent: Unknown
Issue Links:
Related To
 
This issue is related to:
CHD-1112 Add options for selecting which addre... Open


 Description   
Query to find these:
SELECT a.id,count(m.id) as hits FROM address a LEFT JOIN message m ON (a.id=m.address_id) WHERE a.contact_org_id = 0 GROUP BY a.id HAVING hits = 0

We then need to iterate through the results and use DAO_Address to nuke the rows.

In our live helpdesk this can be well over 100,000 addresses due to spam messages that were purged. This should be able to timeout gracefully.

 All   Comments   Work Log   Change History      Sort Order:
Comment by Dan Hildebrandt [WGM] [17/Dec/07 01:25 PM]

Comment by Dan Hildebrandt [WGM] [17/Dec/07 01:28 PM]
Pancho also requests some form of attachment cleanup. (same thread)

Comment by Dan Hildebrandt [WGM] [19/Dec/07 09:54 AM]

Comment by Joe Geck [04/Mar/09 04:33 PM]
May want to update this thread when this or CHD-1112 is completed. Both deal with address deletion (or how to better hide unwanted addresses).

http://www.cerb4.com/forums/showthread.php?t=311

Comment by Sam Ravenscroft [14/Jun/09 01:28 PM]
An improvement on the query shown above:

SELECT a.id,
count(m.id) as hits,
count(r.address_id) as req_hits,
a.email,
a.contact_org_id
FROM address a LEFT JOIN message m ON (a.id=m.address_id)
LEFT JOIN requester r on (r.address_id = a.id)
WHERE a.contact_org_id = 0
GROUP BY a.id
HAVING hits = 0 AND req_hits = 0
ORDER BY req_hits DESC

This makes sure that you are not inadvertently removing an address that is listed as a requester on a ticket.

By removing the line WHERE a.contact_org_id = 0 you can also remove addresses that have an organization assigned to them, but no tickets.

However, I would not remove that as there is probably a reason why the contact (address) has an organization assigned to it.

Comment by Sam Ravenscroft [14/Jun/09 01:47 PM]
And if you're happy with the results from the above query you can execute the following query to delete the addresses:

DELETE FROM address where id in (
SELECT ID from (
SELECT a.id,
count(m.id) as hits,
count(r.address_id) as req_hits,
a.email,
a.contact_org_id
FROM address a LEFT JOIN message m ON (a.id=m.address_id)
LEFT JOIN requester r on (r.address_id = a.id)
WHERE a.contact_org_id = 0
GROUP BY a.id
HAVING hits = 0 AND req_hits = 0
ORDER BY req_hits DESC
) AS d
)

Comment by Robert Middleswarth [14/Jun/09 03:10 PM]
This would be really useful in the maintenance task.

Thanks
Robert

Comment by Robert Middleswarth [14/Jun/09 03:15 PM]
Well testing the query on my 200,000 ticket DB it doesn't seems to like this query very much. I am guessing that limits are going to have to be added to the query to limit the result set.

Thanks
Robert

Comment by Sam Ravenscroft [14/Jun/09 03:21 PM]
I'm no expert, but the query worked for me.

I don't know if this is relevant, but we are using Cerberus v.4.2.1 stable, build 914

Did it help when you added limits?

Comment by Robert Middleswarth [14/Jun/09 03:29 PM]
Actually I am still using 891 v4.1. But it isn't the query is error out it is that the query is timing out. I think I have to many records for the query to complete.

Thanks
Robert

Comment by Robert Middleswarth [05/Sep/09 10:17 AM]
Sam,

I am working on this some this weekend turning this into a plugin so this can be fixed for good. I have a question? Why is your select including a.email,
a.contact_org_id when they are not getting used?

Thanks
Robert

Comment by Robert Middleswarth [05/Sep/09 10:51 AM]
Warning don't run this query if you have a lot of messages in your system. It will grind you system into the ground.

Thanks
Robert

Comment by Sam Ravenscroft [05/Sep/09 10:59 AM]
Hi Robert,

I used the extra select statements in the query as I was writing the query and simply used it so that I could manually go through the results for comparison.

I just forgot to take them out when I wrapped the query in a DELETE statement... sorry about this.

Also - we have a relatively small database, so I did not pick up that the query was slow.

Again - sorry about that, hope you can find a better query for that.

I would be interested in looking at the plugin when complete.

Regards,

Sam

Comment by Robert Middleswarth [05/Sep/09 11:41 AM]
Sam,

Although I will likely rewrite using sub query they seem to be a lot faster you might want to note these changes to the base query. Mysql will do a lot less work in this case.

DELETE FROM address where id in (
SELECT a.id
FROM address a
LEFT JOIN message m ON a.id = m.address_id
LEFT JOIN requester r ON a.id = r.address_id
WHERE a.contact_org_id = 0
AND m.address_id IS NULL
AND r.address_id IS NULL
ORDER BY a.id DESC
)

Comment by Robert Middleswarth [05/Sep/09 12:13 PM]
That last query the select section ran with the following on my live site.

8,072 total, Query took 859.6044 sec

Were as

SELECT a.id,
count(m.id) as hits,
count(r.address_id) as req_hits,
a.email,
a.contact_org_id
FROM address a LEFT JOIN message m ON (a.id=m.address_id)
LEFT JOIN requester r on (r.address_id = a.id)
WHERE a.contact_org_id = 0
GROUP BY a.id
HAVING hits = 0 AND req_hits = 0
ORDER BY req_hits DESC

Would time out.

Comment by Robert Middleswarth [06/Sep/09 11:56 AM]
It is a pretty cool change when you add an index to message.address_id. Query time before.

8,072 total, Query took 859.6044 sec

Query time after.

8,074 total, Query took 0.0203 sec

Now I need to double check something there was a change in 4.3 that might break this query need to double check.

Thanks
Robert

Comment by Robert Middleswarth [06/Sep/09 03:27 PM]
OK I have it all working. I am creating the wiki stuff for it right now.

http://wiki.cerb4.com/wiki/Purge_Contact_Addresses

Enjoy.

Thanks
Robert

Comment by Joe Geck [14/Dec/09 09:09 AM]